Recipes updates

Reverse-chronological log of changes to the recipes cookbook. Newest at the top.

2026-06-14

Added

  • Screen a polypharmacy medication list for drug-drug interactions (Problem class: Knowledge synthesis; Evidence: Reported) - rung-2 DDInter skill recipe taking a medication list through per-drug ID resolution → pairwise DDInter queries → a cited severity/mechanism/management table with explicit “clean” lines, plus an optional rung-3 DailyMed + ClinPGx overlay on the major pairs. Drug Repurposing and Discovery focus-day recipe; cookbook’s first DDI-screening recipe. Reported - Domián et al., Explor. Res. Clin. Soc. Pharm. 2025 documents that ungrounded LLMs over-flag/hallucinate DDIs (Copilot 1,813 vs a 204-interaction reference on 57 real patients), establishing that screening must be anchored to a curated DDI database - the assembly this recipe recommends.
  • Run a GWAS on case-control genotype data (Problem class: Data analysis; Evidence: Proposed) - rung-2 PLINK2 skill recipe taking a PLINK/VCF genotype set through sample + variant QC (call rate, MAF, HWE-in-controls) → LD pruning → genotype PCA → PCA-adjusted logistic-regression --glm association with a lambda_GC inflation check, handing genome-wide-significant loci to the GWAS Catalog skill for annotation. Translational Medicine focus-day recipe; cookbook’s first GWAS recipe. Proposed - no documented LLM-driven PLINK2 assembly; grounded in Chang et al., GigaScience 4:7 (2015) and the canonical QC tutorial (Marees et al., Int. J. Methods Psychiatr. Res. 27:e1608 (2018)).
  • Build a pharmacogenomic dosing report from a patient’s diplotypes (Problem class: Knowledge synthesis; Evidence: Proposed) - rung-2 ClinPGx skill recipe taking star-allele diplotypes plus a medication list through diplotype→metabolizer-phenotype translation (CPIC PostgREST API) → per-drug CPIC/DPWG dosing recommendation lookup → a cited drug / gene / phenotype / recommendation table, with explicit “no actionable guidance” flagging and a DDInter phenoconversion overlay noted. Translational Medicine focus-day recipe; cookbook’s first pharmacogenomic-dosing recipe, distinct from the germline-pathogenicity variant-interpretation recipe. Proposed - no documented LLM-driven ClinPGx/CPIC assembly; grounded in the CPIC guideline corpus (Amstutz et al., Clin. Pharmacol. Ther. 2018; Molden & Jukić, Front. Pharmacol. 2021).
  • Profile a cancer cohort’s genomics with cBioPortal (Problem class: Knowledge synthesis; Evidence: Reported) - rung-2 cBioPortal skill recipe taking a study + gene set through study/profile lookup → per-gene mutation+CNA alteration frequency and co-occurrence/mutual-exclusivity → TMB summary → a Kaplan-Meier overall-survival split by mutation status, with cohort-denominator caveats enforced. Translational Medicine focus-day recipe; cookbook’s first cohort-level cancer-genomics recipe, cross-linked to the gene-centric target dossier, single-variant variant-interpretation, and adjusted-modelling survival recipe. Reported - the cBioPortal-backed AI-HOPE conversational-agent family documents the assembly class (AI-HOPE-WNT, Front. Artif. Intell. 2025, recapitulating WNT-EOCRC survival p=0.0167/0.0007; AI-HOPE-TP53, Cancers 2025).
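
The lambda_GC inflation check named in the GWAS entry above can be sketched in a few lines. This is a minimal illustration, not the recipe's own code: it assumes the association statistics have already been converted to 1-degree-of-freedom chi-square values (for example by squaring Z scores), and the function name is hypothetical.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats):
    """Return lambda_GC = median(observed 1-df chi-square) / expected median."""
    if not chi2_stats:
        raise ValueError("need at least one test statistic")
    return median(chi2_stats) / CHI2_1DF_MEDIAN


# Illustrative values only: statistics sitting at the null median give lambda_GC = 1.
print(genomic_inflation([0.4549364, 0.4549364, 0.4549364]))
```

Values well above 1 are usually read as a sign of residual stratification or other inflation, which is why the entry pairs the check with PCA adjustment.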

Verified (no changes)

  • Build a target dossier and Draft a Phase 2/3 clinical-trial protocol - linked catalog tools and key sources re-checked, last_verified bumped to 2026-06-14.
  • Assemble a tissue reference atlas from the CELLxGENE Census - linked catalog tools (cellxgene-census, scvi-tools, scanpy, anndata) and Census/scvi-hub sources re-checked, last_verified bumped to 2026-06-14.

2026-06-13

Added

  • Infer cell-cell communication from single-cell RNA-seq (Problem class: Data analysis; Evidence: Proposed) - rung-2 LIANA-MCP recipe taking an annotated AnnData object through ls_ccc_method → multi-method communicate (CellPhoneDB/Connectome/NATMI/SingleCellSignalR) → rank_aggregate consensus ligand-receptor tetrads → circle_plot/ccc_dotplot, consuming the annotated object from the scRNA-seq QC recipe. Molecular and Cellular Biology focus-day recipe; cookbook’s first cell-cell-communication recipe. Proposed - no documented LLM-driven LIANA-MCP assembly; grounded in Dimitrov et al., Nat. Commun. 13:3735 (2022), a 2026 consensus-LIANA application (Wei et al., PLOS ONE 2026), and the method-disagreement benchmark (Xie et al., Biomolecules 13:1211 (2023)).
  • Call peaks and find enriched motifs from ChIP-seq or ATAC-seq (Problem class: Data analysis; Evidence: Proposed) - rung-3 toolbelt chaining the MACS3 skill (callpeak, narrow/broad mode → narrowPeak BED) into the HOMER skill (annotatePeaks.pl nearest-gene context + findMotifsGenome.pl de-novo/known motif enrichment). Molecular and Cellular Biology focus-day recipe; the binding-site/motif companion to the deepTools signal-profiling recipe, which deliberately stops before peak calling. Proposed - no documented LLM-driven MACS3→HOMER assembly; grounded in the field-standard pipeline (Zhang et al., Genome Biol. 9:R137 (2008); Heinz et al., Mol. Cell 38:576 (2010)).
  • Analyze an existing MD trajectory for stability, flexibility, and contacts (Problem class: Data analysis; Evidence: Proposed) - rung-2 MDAnalysis skill recipe taking a finished GROMACS/AMBER/NAMD trajectory through a load-and-sanity-check → aligned RMSD/RMSF/Rg → interface contact map + H-bond occupancy → backbone PCA battery, with the MDTraj skill as the DSSP/Ramachandran fallback. Integrative Structural and Computational Biology focus-day recipe; the post-simulation-analysis companion to the GROMACS setup recipe. Proposed - no documented LLM-driven MDAnalysis-skill assembly; grounded in Michaud-Agrawal et al., J. Comput. Chem. 32:2319 (2011), McGibbon et al., Biophys. J. 109:1528 (2015), and class-level agentic-MD evidence (MDCrow, Mach. Learn. Sci. Technol. 2025).
  • Scan a therapeutic antibody for glycosylation sites (Problem class: Experimental design; Evidence: Proposed) - rung-2 Glycoengineering skill recipe taking heavy/light-chain sequences through N-X-S/T sequon detection (flagging Fc Asn-297 vs unintended variable-domain sites) → O-glycosylation hotspot prediction → a parent-vs-variant sequon diff, with optional minimal site-knockout edit suggestions. Immunology and Microbiology focus-day recipe; cookbook’s first antibody-developability / glycosylation recipe. Proposed - no documented LLM-driven glycoengineering-skill assembly; grounded in 2026 Fc-glycan/ADCC literature (Shuang et al., mAbs 2026; Illés 2026) and the galactosylation-as-CQA reference (Klingler et al., Biotechnol. Bioeng. 2024).
  • Compute a bacterial pan-genome from a set of genome assemblies (Problem class: Data analysis; Evidence: Proposed) - rung-3 toolbelt chaining the Bakta skill (identical per-genome annotation → GFF3) into the Roary skill (CD-HIT/BLAST/MCL clustering → core/soft-core/shell/cloud partition, gene_presence_absence.csv, and a core_gene_alignment.aln that feeds the phylogenetics recipe). Immunology and Microbiology focus-day recipe; cookbook’s first comparative-genomics / pan-genome recipe. Proposed - no documented LLM-driven Bakta→Roary assembly; grounded in the field-standard pipeline (Page et al., Bioinformatics 2015; Schwengers et al., Microb. Genom. 2021) and a 2025 27,884-genome application (Sholeh et al., Mol. Genet. Genomics 2025).
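
The N-X-S/T sequon detection step in the antibody glycosylation entry above can be illustrated with a small regular-expression scan. This is a sketch, not the Glycoengineering skill's implementation. It assumes the usual convention that X is any residue except proline, and it reports plain 1-based sequence positions rather than Eu numbering, so Fc Asn-297 will not appear as "297" unless the input is numbered that way. The function name is hypothetical.

```python
import re

# Lookahead so that overlapping sequons (e.g. "NNST") are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(seq):
    """Return 1-based positions of Asn residues in N-X-S/T motifs (X != Pro)."""
    return [m.start() + 1 for m in SEQUON.finditer(seq.upper())]


def sequon_diff(parent, variant):
    """Return (sites lost, sites gained) going from parent to variant."""
    before, after = set(find_sequons(parent)), set(find_sequons(variant))
    return sorted(before - after), sorted(after - before)


# Toy sequences, not real antibody chains.
print(find_sequons("AANGSAANPTA"))
print(sequon_diff("AANGSAA", "AAQGSAA"))
```

The diff helper mirrors the parent-vs-variant comparison named in the entry: an N→Q substitution at the sequon Asn shows up as a lost site.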

Verified (no changes)

  • 35 recipes spot-checked; all last_verified dates within the 30-day window, no aging recipes due.

2026-06-11

Added

  • Profile ChIP-seq or ATAC-seq signal around genomic features (Problem class: Data analysis; Evidence: Proposed) - rung-2 deepTools skill recipe taking aligned ChIP-seq/ATAC-seq BAMs through bamCoverage BPM-normalized bigWig generation → multiBamSummary + plotCorrelation replicate QC → computeMatrix + plotHeatmap/plotProfile TSS/peak-centered visualization, with upstream BAM handling via the pysam skill. Molecular and Cellular Biology focus-day recipe; cookbook’s first ChIP-seq/ATAC-seq coverage-profiling recipe. Proposed - no documented LLM-driven deepTools workflow; grounded in Ramírez et al., NAR 44:W160 (2016) plus class-level Biomni.
  • Predict gene-knockout phenotypes with flux balance analysis (Problem class: Data analysis; Evidence: Proposed) - rung-2 COBRApy skill recipe taking a genome-scale SBML model through baseline FBA sanity-check → genome-wide single_gene_deletion essentiality ranking → focused double_gene_deletion synthetic-lethality screen, with an explicit growth-ratio essentiality threshold. Molecular and Cellular Biology focus-day recipe; cookbook’s first constraint-based metabolic-modelling recipe. Proposed - no documented LLM-driven COBRApy workflow; grounded in Ebrahim et al., BMC Syst. Biol. 7:74 (2013) and Orth et al., Nat. Biotechnol. 28:245 (2010), plus class-level Biomni.

Verified (no changes)

  • 33 recipes spot-checked; all last_verified dates within the 30-day window, no aging recipes due.

2026-06-10

Added

  • Score point mutations for functional impact with a protein language model (Problem class: Data analysis; Evidence: Proposed) - rung-2 ESM skill recipe taking a wild-type protein sequence (optionally fetched by UniProt accession via the gget skill) and a list of substitutions through masked-marginal log-likelihood-ratio scoring → a ranked tolerated/deleterious CSV, with a wt-marginal one-pass variant for full single-mutation landscapes. Integrative Structural and Computational Biology focus-day recipe; cookbook’s first zero-shot variant-effect / protein-fitness recipe and the database-free complement to the clinical-variant interpretation recipe. Proposed - no documented LLM-driven ESM-skill scoring assembly; grounded in the canonical zero-shot method Meier et al., NeurIPS 2021, the ProteinGym benchmark, and 2025 directed-evolution use Zhang et al., Nat. Commun. 2025.

Verified (no changes)

  • 31 recipes spot-checked; all last_verified dates within the 30-day window, no aging recipes due.

2026-06-09

Added

Updated

  • Estimate pharmacokinetic properties of a small molecule - promoted Proposed → Reported on the first field report (issue #12). A user ran the full three-layer assembly through to a finished PK card and captured it in a standalone pk_card.py, verified across caffeine, ibuprofen, quercetin, and terfenadine. Added a Field reports subsection under Evidence and refreshed last_verified to 2026-06-09.

Verified (no changes)

User requests

  • #12 @goodb - resolved. This entry had been stuck open since 2026-05-27 because the responder emitted no machine-readable trailer, so the request content lived only in the GitHub issue body. The sandboxed curator agent (no gh/shell) could not read that body, which left the request “un-actionable” on every retry. Fixed at the source: the recipes.yml / curate.yml workflows now pre-fetch open user-request issue bodies into .request-bodies/<NN>.md before the agent runs, the responder fallback now rebuilds a structured queue entry from the issue-form fields, and RECIPE_AGENT.md / AGENT.md point the agent at the pre-fetched files instead of a gh issue view it can’t run.

2026-06-08

Added

  • Identify an unknown compound from an MS/MS spectrum (Problem class: Data analysis; Evidence: Proposed) - rung-2 matchms skill recipe taking experimental tandem-MS spectra plus a reference library (GNPS / MassBank / in-house .msp) through format import → peak cleaning and metadata harmonization → modified-cosine scoring with precursor-m/z gating → a ranked candidate-identity CSV, handing confirmed InChIKeys off to the PubChem MCP and the polypharmacology recipe. Chemistry focus-day recipe; cookbook’s first metabolomics / spectral-library-matching recipe. Proposed - no documented LLM-driven matchms workflow; grounded in the canonical library paper Huber et al., JOSS 5(52):2411 (2020) plus methodological anchors Onoprishvili et al., Bioinformatics (2025) (SimMS) and Xing et al., Anal. Chem. (2025) (enhanced reverse spectral search).

Verified (no changes)

  • Aging-recipe sweep: oldest last_verified is 2026-05-24 (15 days), within the 30-day window - no recipes due for re-verification this run.

User requests

  • #12 (@goodb) - still no gh permission to read the issue body from this run; left open for next-run retry.

2026-06-07

Added

  • Enumerate analogs around a lead compound for SAR expansion (Problem class: Hypothesis generation; Evidence: Proposed) - rung-2 Datamol skill recipe taking a lead SMILES through standardization → tautomer / stereoisomer enumeration → single-point fragment-substitution scan → ECFP4 Tanimoto + QED scoring → a deduplicated SAR-expansion CSV, with explicit handoff to the VS-hit-filtering developability gate and the polypharmacology bioactivity lookup. Drug Repurposing and Discovery focus-day recipe; cookbook’s first dedicated analog-enumeration / lead-optimisation recipe and the natural upstream of the existing hit-filtering recipe; cookbook’s second Hypothesis generation recipe. Proposed - no documented LLM-driven Datamol enumeration workflow; closest grounding is the K-Dense rdkit→datamol→medchem lead-optimisation workflow plus the underlying primitives Rogers & Hahn, JCIM 50:742 (2010) (ECFP/Tanimoto), Bickerton et al., Nat. Chem. 4:90 (2012) (QED), and Griffen et al., J. Med. Chem. 54:7739 (2011) (matched molecular pairs).

Updated

  • Nav orders rebalanced to keep alphabetical title ordering after the new addition. “Enumerate analogs…” inserted at 10; everything from “Estimate pharmacokinetic properties” downward shifted +1 (Estimate → 11, Filter VS hits → 12, Infer GRN → 13, Integrate single-cell → 14, Interpret variant → 15, Match patient → 16, Organize DICOM → 17, Parse FCS → 18, Prioritize targets → 19, Profile polypharmacology → 20, Run bulk RNA-seq → 21, Run first-pass QC → 22, Run functional enrichment → 23, Scan repurposing → 24, Set up MD → 25, Sort spikes → 26, Triage preprints → 27, Triage AlphaFold → 28, Fit survival → 29, Scan adverse events → 30).

Verified (no changes)

  • 29 existing recipes spot-checked; none past the 30-day last_verified window (oldest is 2026-05-24, profile-compound-polypharmacology), so no re-verification was due this run.

2026-06-06

Added

  • Fit a survival model to censored clinical outcomes (Problem class: Data analysis; Evidence: Proposed) - rung-2 scikit-survival skill recipe taking a tidy covariate table plus a (time, event) outcome through structured-Surv encoding → Kaplan-Meier + log-rank → Cox PH (with a proportional-hazards check) → Random Survival Forest → cross-validated Harrell’s c-index → risk-group stratification. First Translational Medicine focus-day recipe of this run; cookbook’s first dedicated time-to-event / prognosis recipe. Proposed - no documented end-to-end LLM-driven sksurv workflow; closest grounding is the library reference Pölsterl, JMLR 21(212):1–6 (2020) and recent RSF-vs-nomogram prognosis studies Zhang et al., Transl. Cancer Res. (2026) and Liu et al., Medicine (2026).
  • Scan adverse-event reports for a drug-safety signal (Problem class: Knowledge synthesis; Evidence: Proposed) - rung-2 OpenFDA MCP recipe taking a drug name through generic-name resolution → FAERS top-reaction ranking → structured label / warning pull → label-vs-FAERS cross-check → an honest “reports, not rates” framing. Second Translational Medicine focus-day recipe of this run; promoted from the Deferred - next-run priority list; cookbook’s first pharmacovigilance recipe. Proposed - no documented attempt of this exact MCP assembly; openFDA/FAERS is the canonical public pharmacovigilance source and the server wraps it faithfully.

Verified (no changes)

  • 27 existing recipes spot-checked; none past the 30-day last_verified window (oldest is 2026-05-24), so no re-verification was due this run.

2026-06-05

Added

Updated

  • Nav orders rebalanced to keep alphabetical title ordering after the new addition and to fix a stale collision between Run first-pass QC and Run functional enrichment (both stamped 20). “Organize a raw DICOM dataset…” inserted at 16; everything from “Parse FCS…” downward shifted by +1, with Run first-pass QC at 21 and Run functional enrichment at 22: Parse FCS flow-cytometry files → 17, Prioritize targets → 18, Profile polypharmacology → 19, Run bulk RNA-seq DE → 20, Run first-pass QC → 21, Run functional enrichment → 22, Scan repurposing → 23, Set up protein MD → 24, Sort spikes → 25, Triage preprints → 26, Triage AlphaFold → 27.
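
The collision fix and +1 shift described above could be checked mechanically along these lines. This is a sketch only: the log does not say where nav orders are stored, so the title-to-order dictionary and both function names are hypothetical, and the starting values are a small excerpt based on the earlier entries.

```python
from collections import defaultdict


def find_nav_collisions(nav_orders):
    """Map each nav order used by more than one recipe to the titles sharing it."""
    by_order = defaultdict(list)
    for title, order in nav_orders.items():
        by_order[order].append(title)
    return {o: sorted(ts) for o, ts in by_order.items() if len(ts) > 1}


def shift_from(nav_orders, start, by=1):
    """Return a copy in which every order >= start is moved down by `by` slots."""
    return {t: (o + by if o >= start else o) for t, o in nav_orders.items()}


# Excerpt of the pre-fix state: two recipes both stamped 20.
before = {
    "Parse FCS": 16,
    "Run bulk RNA-seq DE": 19,
    "Run first-pass QC": 20,
    "Run functional enrichment": 20,
}
print(find_nav_collisions(before))

# Open slot 16 for the new recipe, then separate the colliding pair by hand.
after = shift_from(before, start=16)
after["Run functional enrichment"] += 1
print(find_nav_collisions(after))
```

After the shift and the manual bump, the excerpt matches the values in the entry (Parse FCS at 17, Run first-pass QC at 21, Run functional enrichment at 22).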

Verified (no changes)

  • No aging recipes due - every last_verified date is within the 30-day window. The verification floor sits at 2026-05-24 (scan-drug-repurposing-candidates); next aging boundary is 2026-06-23.
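
The floor and boundary arithmetic in these verification entries is plain date addition. A minimal sketch follows, assuming a 30-day window and treating a recipe as due only once its age strictly exceeds the window. The exact inclusive/exclusive convention is an assumption, as are the function names and the second recipe slug.

```python
from datetime import date, timedelta

WINDOW = timedelta(days=30)  # assumed to match the log's "30-day window"


def verification_floor(last_verified):
    """Return (slug, date) for the oldest last_verified entry."""
    slug = min(last_verified, key=last_verified.get)
    return slug, last_verified[slug]


def aging_boundary(last_verified):
    """Date on which the oldest recipe reaches the edge of the window."""
    _, floor = verification_floor(last_verified)
    return floor + WINDOW


def due(last_verified, today):
    """Slugs whose age strictly exceeds the window (assumed convention)."""
    return sorted(s for s, d in last_verified.items() if today - d > WINDOW)


recipes = {
    "scan-drug-repurposing-candidates": date(2026, 5, 24),
    "example-newer-recipe": date(2026, 6, 5),  # hypothetical slug and date
}
print(verification_floor(recipes))
print(aging_boundary(recipes))
print(due(recipes, date(2026, 6, 5)))
```

With a floor of 2026-05-24 this gives a boundary of 2026-06-23, the date quoted in the entry. A floor of 2026-05-22 gives 2026-06-21 in the same way.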

User requests

  • #12 @goodb - still cannot access the issue body (no gh permission for the repo in this run); leaving open in recipes/curator-state.md for the next run with gh access.

2026-06-04

Added

  • Run functional enrichment on a gene list (Problem class: Data analysis; Evidence: Reported) - rung-2 gget skill recipe taking a list of gene symbols through gget enrichr against GO BP, KEGG, Reactome, MSigDB Hallmark, and DisGeNET → per-library CSV → grounded natural-language summary with explicit verification pass against the saved tables and a random-gene negative-control step. First Molecular and Cellular Biology focus-day recipe of this run; the cookbook’s first dedicated functional-enrichment / pathway-interpretation recipe and the natural downstream step after bulk RNA-seq DE. Reported evidence anchored in Wang et al., GeneAgent, Nature Methods 22:1677, 2025 - self-verification against Enrichr and curated databases lifts ROUGE-L on MSigDB from 0.239±0.038 (GPT-4) to 0.310±0.047 (GeneAgent) across 1,106 gene sets, with 84% of 15,848 claims database-supported and 92% of self-verification decisions correct on a 132-claim expert-judged sample; complementary anchors Hu et al., Nat. Methods 21:2353, 2024 and Joshi et al., llm2geneset (bioRxiv 2024-11-12).

Verified (no changes)

User requests

  • #12 @goodb - still cannot access the issue body (no gh permission in this run); leaving open in recipes/curator-state.md for the next run with gh access.

2026-06-03

Added

Updated

  • Nav orders rebalanced to keep alphabetical title ordering after the new addition. “Dock a ligand library…” inserted at 8; everything from “Draft Phase 2/3…” downward shifted by +1: Draft Phase 2/3 clinical-trial protocol → 9, Estimate PK → 10, Filter virtual screening → 11, Infer GRN → 12, Integrate single-cell → 13, Interpret clinical variant → 14, Match patient to trials → 15, Parse FCS flow-cytometry files → 16, Prioritize targets → 17, Profile polypharmacology → 18, Run bulk RNA-seq DE → 19, Run first-pass QC → 20, Scan repurposing → 21, Set up protein MD → 22, Sort spikes → 23, Triage preprints → 24, Triage AlphaFold → 25.

Verified (no changes)

  • No aging recipes due - every last_verified date is within the 30-day window. The recipe set’s verification floor sits at 2026-05-22 (integrate-single-cell-datasets, sort-spikes-from-neuropixels-recording); next aging boundary is 2026-06-21.

User requests

  • #12 @goodb - still cannot access the issue body (no gh permission for the repo in this run); leaving the request open in recipes/curator-state.md for the next run with gh access.

2026-06-02

Added

  • Compute 16S microbiome alpha/beta diversity from a BIOM table (Problem class: Data analysis; Evidence: Proposed) - rung-2 scikit-bio skill recipe taking a BIOM feature table + sample metadata + Newick tree through rarefaction → Shannon/Simpson/Faith’s PD → weighted/unweighted UniFrac → PCoA → PERMANOVA with explicit grouping-column and permutation-count flags. First Immunology and Microbiology focus-day recipe of this run; cookbook’s first dedicated microbiome / community-ecology recipe. Proposed because no documented end-to-end attempt of this exact assembly exists; closest class-level evidence is Huang et al. Biomni (bioRxiv 2025.05.30.656746) whose published benchmark includes microbiome disease-taxa bioinformatics across five datasets (HMP, MetaPhlAn2 human metagenomics, drinking-water OTU matrices) at ~4× over base-LLM accuracy.
  • Parse FCS flow-cytometry files for downstream immunophenotyping (Problem class: Data analysis; Evidence: Proposed) - rung-2 FlowIO skill recipe taking a directory of vendor-emitted FCS 2.0/3.0/3.1 files through FlowData parsing → per-file metadata harvest → scatter/fluorescence/time channel categorisation → optional log/gain transforms → concatenated long-format events Parquet, with explicit failure surfacing for partial-acquisition files. Second Immunology and Microbiology focus-day recipe; cookbook’s first cytometry / FCS recipe. Proposed because no documented end-to-end attempt of this exact assembly exists; closest class-level evidence is “Enhancing Clinical Workflow Efficiency in Flow Cytometry Reporting with LLMs” (PMC13053331, J. Clin. Immunol. 2026), which demonstrates pathologist-level accuracy of fine-tuned LLMs on the downstream report-generation step the parsed-events output feeds into.
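
For the Shannon and Simpson indices named in the microbiome entry above, the definitions can be written out in a few lines of standard-library Python. This is an illustration of the formulas, not the scikit-bio calls the recipe uses. It assumes natural-log Shannon entropy and the 1 - sum(p^2) form of Simpson's index, and library defaults (for example the log base) may differ.

```python
import math


def _proportions(counts):
    """Convert raw per-taxon counts to relative abundances, dropping zeros."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no observations")
    return [c / total for c in counts if c > 0]


def shannon(counts):
    """Shannon entropy H = -sum(p * ln p), in nats."""
    return -sum(p * math.log(p) for p in _proportions(counts))


def simpson(counts):
    """Simpson's index 1 - sum(p^2): chance two random reads differ in taxon."""
    return 1.0 - sum(p * p for p in _proportions(counts))


# Toy counts for one sample: four equally abundant taxa.
sample = [25, 25, 25, 25]
print(shannon(sample), simpson(sample))
```

A perfectly even four-taxon sample gives H = ln 4 and a Simpson index of 0.75, while a single-taxon sample gives zero for both.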

Updated

  • Nav orders rebalanced to keep alphabetical title ordering after the two additions: Assemble Census atlas → 1, Benchmark ADMET → 2, Build target dossier → 3, Compute 16S microbiome diversity → 4 (new), Compute HRV → 5, Convert instrument data → 6, Discover NWB on DANDI → 7, Draft Phase 2/3 clinical-trial protocol → 8, Estimate PK → 9, Filter virtual screening → 10, Infer GRN → 11, Integrate single-cell → 12, Interpret clinical variant → 13, Match patient to trials → 14, Parse FCS flow-cytometry files → 15 (new), Prioritize targets → 16, Profile polypharmacology → 17, Run bulk RNA-seq DE → 18, Run first-pass QC → 19, Scan repurposing → 20, Set up protein MD → 21, Sort spikes → 22, Triage preprints → 23, Triage AlphaFold → 24.

Verified (no changes)

  • No aging recipes due - every last_verified date is within the 30-day window. The recipe set’s verification floor sits at 2026-05-22 (integrate-single-cell-datasets, sort-spikes-from-neuropixels-recording); next aging boundary is 2026-06-21.

User requests

  • #12 (claude:recipe-feedback) - remains in ## User requests (open); gh CLI is still not available in this run’s environment so the issue body cannot be inspected. Retry next run with gh access.

2026-06-01

Added

  • Convert raw analytical instrument data to Allotrope ASM JSON (Problem class: Workflow automation; Evidence: Reported) - rung-2 instrument-data-to-allotrope skill recipe taking a vendor-format file (cell counter, plate reader, HPLC, MS, qPCR) through auto-detect → allotropy native parse → ASM JSON-LD + flattened CSV + exportable Python parser, with strict-validation of the raw-vs-derived split before LIMS / data-lake handoff. First Chemistry focus-day recipe of this run; cookbook’s first workflow-automation recipe spanning the Anthropic life-sciences plugin family. Anchored in the Claude for Life Sciences launch (October 2025), the Anthropic Vi-CELL tutorial, and the underlying Benchling-Open-Source/allotropy reference parser.
  • Set up a protein molecular dynamics simulation in GROMACS from a PDB ID (Problem class: Experimental design; Evidence: Proposed) - rung-2 molecule-mcp recipe driving the GROMACS Copilot server end-to-end (topology → solvation → ion neutralisation → minimisation → NVT/NPT → 50 ns production → RMSD/RMSF/Rg) with explicit force-field / water-model / GPU-offload flags. Second Chemistry focus-day recipe; first cookbook entry exercising the GROMACS path of the molecule-mcp bundle. Proposed because no documented end-to-end attempt of this exact assembly exists; closest peer-reviewed class-level evidence is MDCrow (Campbell et al., Mach. Learn. Sci. Technol. 2025, DOI:10.1088/2632-2153/ae4b07) - OpenMM rather than GROMACS but same architecture - plus GROMACS-supporting follow-ons DynaMate (arXiv:2512.10034) and NAMD-Agent (arXiv:2507.07887), and the MDGym benchmark (arXiv:2605.08941) as a reality check (Claude Code / Codex / OpenHands all solve <21% of easy GROMACS/LAMMPS tasks).

Updated

  • Nav orders rebalanced to restore strict alphabetical title ordering after the two additions and to correct two prior off-by-many drifts (Benchmark ADMET was at 20 instead of 2; Prioritize Targets was at 19 instead of 14): Assemble Census atlas → 1, Benchmark ADMET → 2, Build target dossier → 3, Compute HRV → 4, Convert instrument data → 5 (new), Discover NWB on DANDI → 6, Draft a Phase 2/3 clinical-trial protocol → 7, Estimate PK → 8, Filter virtual screening → 9, Infer GRN → 10, Integrate single-cell → 11, Interpret clinical variant → 12, Match patient to trials → 13, Prioritize targets → 14, Profile polypharmacology → 15, Run bulk RNA-seq DE → 16, QC single-cell → 17, Scan repurposing → 18, Set up protein MD in GROMACS → 19 (new), Sort spikes → 20, Triage preprints → 21, Triage AlphaFold → 22.
  • recipes/curator-state.md - ## Missing components entry for “DeepChem (K-Dense Skill)” removed; DeepChem is now catalogued at catalog/tools/deepchem.md.

Verified (no changes)

  • No aging recipes due - every last_verified date is within the 30-day window. The recipe set’s verification floor sits at 2026-05-22 (integrate-single-cell-datasets, sort-spikes-from-neuropixels-recording); next aging boundary is 2026-06-21.

User requests

  • #12 (claude:recipe-feedback) - remains in ## User requests (open); gh CLI is still not available in this run’s environment so the issue body cannot be inspected. Retry next run with gh access.

2026-05-31

Added

Verified (no changes)

  • No aging recipes due - every last_verified date is within the 30-day window. The recipe set’s verification floor sits at 2026-05-22 (integrate-single-cell-datasets, sort-spikes-from-neuropixels-recording); next aging boundary is 2026-06-21.

User requests

  • #12 (claude:recipe-feedback) - remains in ## User requests (open); gh CLI is still not available in this run’s environment so the issue body cannot be inspected. Retry next run with gh access.

2026-05-30

Added

  • Draft a Phase 2/3 clinical-trial protocol from an indication brief (Problem class: Manuscript prep; Evidence: Reported) - rung-2 clinical-trial-protocol Anthropic Healthcare plugin recipe that walks an indication / endpoint paragraph through the four-waypoint flow (regulatory classification, ClinicalTrials.gov competitive landscape, sample-size calculation, FDA/NIH-template drafting), emerging with a reviewable draft Phase 2/3 protocol scaffold. First Translational Medicine focus-day recipe of the new run; resolves a previously deferred candidate. Evidence anchored in the Anthropic plugin tutorial (Claude for Healthcare launch, January 2026) and class-level validation in Markey et al. Clinical Trials 2025 (80% content relevance, >99% terminology accuracy with RAG), Shin et al. Clinical Pharmacology & Therapeutics 2026 (100% accuracy on disease/intervention/comparator extraction, 14/15 trials for sample-size identification), Hauptman et al. JMIR Dermatology 2026, and Maleki, arXiv 2404.05044 (2024).

Updated

  • Nav orders rebalanced across the recipe set to keep alphabetical ordering after the addition: Assemble Census atlas → 1, Build target dossier → 2, Compute HRV → 3, Discover NWB on DANDI → 4, Draft a Phase 2/3 clinical-trial protocol → 5 (new), Estimate PK → 6, Filter virtual screening → 7, Infer GRN → 8, Integrate single-cell → 9, Interpret clinical variant → 10, Match patient to trials → 11, Profile polypharmacology → 12, Run bulk RNA-seq DE → 13, QC single-cell → 14, Scan repurposing → 15, Sort spikes → 16, Triage preprints → 17, Triage AlphaFold → 18.

Verified (no changes)

  • No aging recipes due - every last_verified date is within the 30-day window. The recipe set’s verification floor sits at 2026-05-22 (integrate-single-cell-datasets, sort-spikes-from-neuropixels-recording); next aging boundary is 2026-06-21.

User requests

  • #12 (claude:recipe-feedback) - remains in ## User requests (open); gh CLI is still not available in this run’s environment so the issue body cannot be inspected. Retry next run with gh access.

2026-05-29 (second pass - Neuroscience directed)

Added

Updated

  • Nav orders rebalanced across the recipe set to keep alphabetical ordering after the addition: Assemble Census atlas → 1, Build target dossier → 2, Compute HRV → 3, Discover NWB on DANDI → 4, Estimate PK → 5, Filter virtual screening → 6, Infer GRN → 7, Integrate single-cell → 8, Interpret clinical variant → 9, Match patient to trials → 10, Profile polypharmacology → 11, Run bulk RNA-seq DE → 12, QC single-cell → 13, Scan repurposing → 14, Sort spikes → 15, Triage preprints → 16, Triage AlphaFold → 17.

Verified (no changes)

  • No aging recipes this run - every last_verified date is within the 30-day window. The recipe set’s verification floor sits at 2026-05-22 (integrate-single-cell-datasets, sort-spikes-from-neuropixels-recording); next aging boundary is 2026-06-21.

User requests

  • #12 (claude:recipe-feedback) - remains in ## User requests (open); gh CLI still unavailable in this run’s environment so the issue body cannot be inspected. Retry next run with gh access.

2026-05-29

Added

Updated

  • Nav orders rebalanced across the recipe set to keep alphabetical ordering after the addition: Assemble Census atlas → 1, Build target dossier → 2, Compute HRV → 3, Estimate PK → 4, Filter virtual screening → 5, Infer GRN → 6, Integrate single-cell → 7, Interpret clinical variant → 8, Match patient to trials → 9, Profile polypharmacology → 10, Run bulk RNA-seq DE → 11, QC single-cell → 12, Scan repurposing → 13, Sort spikes → 14, Triage preprints → 15, Triage AlphaFold → 16.

Verified (no changes)

  • 4 recipes spot-checked at the 30-day boundary and bumped to last_verified: 2026-05-29 - Triage preprints, QC single-cell, Build target dossier, Run bulk RNA-seq DE. All linked catalog tools (bio-research, pubmed, single-cell-rna-qc, pydeseq2, open-targets, uniprot, alphafold, depmap) remain present and unflagged.

User requests

  • #12 (claude:recipe-feedback) - remains in ## User requests (open); gh CLI is not available in this run’s environment so the issue body still cannot be inspected. Retry on the next run that has gh access.

2026-05-28

Added

Updated

  • Nav orders rebalanced across the recipe set to keep alphabetical ordering after the two additions: Assemble Census atlas → 1, Build target dossier → 2, Estimate PK → 3, Filter virtual screening → 4, Infer GRN → 5, Integrate single-cell → 6, Interpret clinical variant → 7, Match patient to trials → 8, Profile polypharmacology → 9, Run bulk RNA-seq DE → 10, QC single-cell → 11, Scan repurposing → 12, Sort spikes → 13, Triage preprints → 14, Triage AlphaFold → 15.

Missing components flagged to the catalog curator

  • pySCENIC wrapper (cisTarget + AUCell) - would unlock the full SCENIC pipeline downstream of the new GRN-inference recipe (motif filtering against cisTarget databases, per-cell regulon AUCell scoring).

Verified (no changes)

  • All 13 pre-existing recipes have last_verified within the 30-day window (oldest 2026-05-21); no aging verifications were due this run.

2026-05-27

Added

  • Estimate pharmacokinetic properties of a small molecule (Problem class: Knowledge synthesis; Evidence: Proposed) - rung-3 RDKit + MedChem + ChEMBL assembly producing a descriptor / rule-based / analog-anchored PK card for a single SMILES. Ships in response to user request #8. Closest documented analogues: ChemCrow (Bran et al., Nature Machine Intelligence 2024) and PharmaBench (Niu et al., Scientific Data 2024).
  • Triage an AlphaFold model for structure-based drug design (Problem class: Knowledge synthesis; Evidence: Proposed) - rung-2 AlphaFold MCP recipe producing a pLDDT-anchored go/refine/fall-back-to-PDB verdict on a UniProt accession. First Integrative Structural and Computational Biology-primary recipe. Evidence grounded in the EBI AlphaFold DB papers (Varadi 2022, Varadi 2024), the interface-pLDDT benchmark (Bryant 2022), and the AlphaFold-for-docking assessment (Karelina 2023).

Updated

  • Nav orders rebalanced across the recipe set to keep alphabetical ordering after the two additions: Estimate PK properties → 2, Filter virtual screening hits → 3, Integrate single-cell datasets → 4, Interpret clinical variant → 5, Match patient to trials → 6, Profile polypharmacology → 7, Run bulk RNA-seq DE → 8, QC single-cell RNA-seq → 9, Scan repurposing candidates → 10, Sort spikes → 11, Triage preprints → 12, Triage AlphaFold model → 13.

Missing components flagged to the catalog curator

  • ADMET-AI / ADMETlab 3.0 / Deep-PK wrapper - would let the new PK-properties recipe move from descriptor-and-analog estimation to defensible ML prediction for CYP / hERG / microsomal endpoints.
  • DeepChem (K-Dense Skill) - already flagged in the catalog curator’s state; would also strengthen the PK-properties recipe.
  • Co-folding / AlphaFold-Multimer / Boltz-2 wrapper - would unlock a complex-modelling companion to the AlphaFold triage recipe.

Verified (no changes)

  • All recipes have last_verified within the 30-day window; no aging verifications were due this run.

2026-05-25

Added

  • Filter a virtual screening hit list with drug-likeness rules and structural alerts (Problem class: Data analysis; Evidence: Reported) - rung-2 MedChem + Datamol cascade for Lipinski → Veber → PAINS → BRENK triage of SMILES hit lists. First Chemistry-primary recipe in the cookbook. Evidence anchored in the K-Dense lead-optimisation workflow and the foundational filter papers (Baell & Holloway PAINS 2010, Brenk 2008, Lipinski 2001, Veber 2002).
  • Profile a compound’s polypharmacology from ChEMBL bioactivity data (Problem class: Knowledge synthesis; Evidence: Reported) - rung-2 single-tool recipe over the ChEMBL connector. Second Chemistry-primary recipe and the compound-centric mirror of the existing target-dossier recipe. Evidence grounded in the Anthropic ChEMBL Connector tutorial and the ChEMBL curation paper (Mendez et al., NAR 2019).

Updated

  • Integrate multiple single-cell RNA-seq datasets across batches - nav_order 2 → 3 for alphabetical position after the new Filter recipe.
  • Interpret a clinical variant from a natural-language query - nav_order 3 → 4.
  • Match a patient summary to recruiting clinical trials - nav_order 4 → 5.
  • Run bulk RNA-seq differential expression from a counts matrix - nav_order 5 → 7 (after the new Profile recipe).
  • Run first-pass QC on a single-cell RNA-seq dataset - nav_order 6 → 8.
  • Scan approved drugs for repurposing candidates against a disease - nav_order 7 → 9.
  • Sort spikes from a Neuropixels recording end-to-end - nav_order 8 → 10.
  • Triage a stack of new preprints in your field - nav_order 9 → 11.

Verified (no changes)

  • 9 existing recipes spot-checked; all last_verified dates within the 30-day window, all linked catalog pages resolve.

2026-05-24

Added

Updated

  • Sort spikes from a Neuropixels recording end-to-end - nav_order 7 → 8 for alphabetical position.
  • Triage a stack of new preprints in your field - nav_order 8 → 9 for alphabetical position.

Verified (no changes)

  • 8 existing recipes spot-checked; all last_verified dates within the 30-day window, all linked catalog pages resolve.

2026-05-23

Added

  • Match a patient summary to recruiting clinical trials (Problem class: Knowledge synthesis; Evidence: Reported) - rung-2 BioMCP / cyanheads-ClinicalTrials.gov-MCP recipe; first Translational-Medicine-focused recipe in the cookbook. Evidence grounded in TrialGPT (Jin et al., Nature Communications 2024, 87.3% criterion-matching accuracy).
  • Interpret a clinical variant from a natural-language query (Problem class: Knowledge synthesis; Evidence: Proposed) - rung-2 BioMCP recipe; pairs with the trial-matching recipe for variant-driven enrollment. Closest analogous benchmark is MARRVEL-MCP (bioRxiv 2025-11).

Updated

  • Run bulk RNA-seq differential expression from a counts matrix - nav_order 3 → 5 for alphabetical position after the two new TM recipes.
  • Run first-pass QC on a single-cell RNA-seq dataset - nav_order 4 → 6 for alphabetical position.
  • Sort spikes from a Neuropixels recording end-to-end - nav_order 5 → 7 for alphabetical position.
  • Triage a stack of new preprints in your field - nav_order 6 → 8 for alphabetical position.

Verified (no changes)

  • 5 existing recipes spot-checked; all last_verified dates within the 30-day window, all linked catalog pages resolve.

2026-05-22

Added

  • Integrate multiple single-cell RNA-seq datasets across batches (Problem class: Data analysis; Evidence: Reported) - rung-2 recipe wrapping the Anthropic scvi-tools skill for scVI / scANVI batch integration; written in response to user request #7; evidence grounded in Hrovatin 2025 and scIB-E 2025 (source).
  • Sort spikes from a Neuropixels recording end-to-end (Problem class: Data analysis; Evidence: Reported) - rung-2 recipe wrapping the K-Dense neuropixels-analysis skill (SpikeInterface + Kilosort4); first Neuroscience-only recipe in the cookbook (source).

Updated

  • Run bulk RNA-seq differential expression from a counts matrix - nav_order shifted 2 → 3 for alphabetical position.
  • Run first-pass QC on a single-cell RNA-seq dataset - nav_order shifted 3 → 4 for alphabetical position.
  • Triage a stack of new preprints in your field - nav_order shifted 4 → 6 for alphabetical position.

Verified (no changes)

  • 4 existing recipes spot-checked (all linked catalog pages resolve; last_verified 2026-05-21 still within the 30-day window so no bumps).

2026-05-21

Added

  • Run first-pass QC on a single-cell RNA-seq dataset (Problem class: Data analysis; Evidence: Reported) - rung-2 recipe wrapping Anthropic’s single-cell-rna-qc skill for canonical scverse MAD-based filtering of 10x .h5 / AnnData .h5ad inputs (source).
  • Run bulk RNA-seq differential expression from a counts matrix (Problem class: Data analysis; Evidence: Reported) - rung-2 recipe wrapping the K-Dense PyDESeq2 skill for negative-binomial GLM differential expression, including pseudobulk single-cell handoff guidance (source).
  • Build a target dossier from gene name to structure to cancer dependency (Problem class: Knowledge synthesis; Evidence: Proposed) - first rung-3 toolbelt recipe composing Open Targets, UniProt, AlphaFold, and DepMap into a one-page target dossier; first Proposed-evidence entry in the cookbook (closest analogue).

Updated

  • Triage a stack of new preprints in your field - nav_order shifted from 1 to 4 to reflect alphabetical ordering after the three new Mol/Cell Bio additions; no content changes.

Verified (no changes)

  • 1 recipe spot-checked, current (triage-new-preprints, last_verified 2026-05-21).

2026-05-21 (initial seed)

Added

  • Section bootstrap - recipes/ section created with landing page, landscape page, and the all-recipes index; recipes/curator-state.md initialized; RECIPES_CHANGELOG.md (this file) created. Curator prompt and daily workflow added at RECIPE_AGENT.md and .github/workflows/recipes.yml.
  • Triage a stack of new preprints in your field (Problem class: Literature triage; Evidence: Reported) - first seed recipe demonstrating the schema and the lowest rung of the simplicity ladder (Claude Code alone + bioRxiv MCP) (source).