Recipes updates
Reverse-chronological log of changes to the recipes cookbook. Newest at the top.
2026-06-14
Added
- Screen a polypharmacy medication list for drug-drug interactions (Problem class: Knowledge synthesis; Evidence: Reported): rung-2 DDInter skill recipe taking a medication list through per-drug ID resolution → pairwise DDInter queries → a cited severity/mechanism/management table with explicit "clean" lines, plus an optional rung-3 DailyMed + ClinPGx overlay on the major pairs. Drug Repurposing and Discovery focus-day recipe; cookbook's first DDI-screening recipe. Reported: Domián et al., Explor. Res. Clin. Soc. Pharm. 2025 documents that ungrounded LLMs over-flag/hallucinate DDIs (Copilot 1,813 vs a 204-interaction reference on 57 real patients), establishing that screening must be anchored to a curated DDI database, the assembly this recipe recommends.
- Run a GWAS on case-control genotype data (Problem class: Data analysis; Evidence: Proposed): rung-2 PLINK2 skill recipe taking a PLINK/VCF genotype set through sample + variant QC (call rate, MAF, HWE-in-controls) → LD pruning → genotype PCA → PCA-adjusted logistic-regression --glm association with a lambda_GC inflation check, handing genome-wide-significant loci to the GWAS Catalog skill for annotation. Translational Medicine focus-day recipe; cookbook's first GWAS recipe. Proposed: no documented LLM-driven PLINK2 assembly; grounded in Chang et al., GigaScience 4:7 (2015) and the canonical QC tutorial (Marees et al., Int. J. Methods Psychiatr. Res. 27:e1608 (2018)).
- Build a pharmacogenomic dosing report from a patient's diplotypes (Problem class: Knowledge synthesis; Evidence: Proposed): rung-2 ClinPGx skill recipe taking star-allele diplotypes plus a medication list through diplotype→metabolizer-phenotype translation (CPIC PostgREST API) → per-drug CPIC/DPWG dosing recommendation lookup → a cited drug/gene/phenotype/recommendation table, with explicit "no actionable guidance" flagging and a DDInter phenoconversion overlay noted. Translational Medicine focus-day recipe; cookbook's first pharmacogenomic-dosing recipe, distinct from the germline-pathogenicity variant-interpretation recipe. Proposed: no documented LLM-driven ClinPGx/CPIC assembly; grounded in the CPIC guideline corpus (Amstutz et al., Clin. Pharmacol. Ther. 2018; Molden & Jukić, Front. Pharmacol. 2021).
- Profile a cancer cohort's genomics with cBioPortal (Problem class: Knowledge synthesis; Evidence: Reported): rung-2 cBioPortal skill recipe taking a study + gene set through study/profile lookup → per-gene mutation+CNA alteration frequency and co-occurrence/mutual-exclusivity → TMB summary → a Kaplan-Meier overall-survival split by mutation status, with cohort-denominator caveats enforced. Translational Medicine focus-day recipe; cookbook's first cohort-level cancer-genomics recipe, cross-linked to the gene-centric target dossier, single-variant variant-interpretation, and adjusted-modelling survival recipes. Reported: the cBioPortal-backed AI-HOPE conversational-agent family documents the assembly class (AI-HOPE-WNT, Front. Artif. Intell. 2025, recapitulating WNT-EOCRC survival p=0.0167/0.0007; AI-HOPE-TP53, Cancers 2025).
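The lambda_GC inflation check named in the GWAS recipe above can be sketched as follows. This is a minimal illustration rather than the recipe's own code: it assumes 1-degree-of-freedom chi-square association statistics are already available as a plain list, and the function name and example values are hypothetical.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom (approx. 0.4549).
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats):
    """Return lambda_GC: observed median chi-square over the expected null median."""
    if not chi2_stats:
        raise ValueError("need at least one association statistic")
    return median(chi2_stats) / CHI2_1DF_MEDIAN


# Hypothetical 1-df chi-square statistics, standing in for an association run.
example_stats = [0.10, 0.30, 0.45, 0.60, 1.20, 3.80, 0.02]
lambda_gc = genomic_inflation(example_stats)
print(f"lambda_GC = {lambda_gc:.3f}")
```

A value close to 1 suggests little test-statistic inflation; noticeably larger values usually prompt another look at population structure and QC before any locus is reported.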
Verified (no changes)
- Build a target dossier and Draft a Phase 2/3 clinical-trial protocol: linked catalog tools and key sources re-checked, last_verified bumped to 2026-06-14.
- Assemble a tissue reference atlas from the CELLxGENE Census: linked catalog tools (cellxgene-census, scvi-tools, scanpy, anndata) and Census/scvi-hub sources re-checked, last_verified bumped to 2026-06-14.
2026-06-13
Added
- Infer cell-cell communication from single-cell RNA-seq (Problem class: Data analysis; Evidence: Proposed): rung-2 LIANA-MCP recipe taking an annotated AnnData object through ls_ccc_method → multi-method communicate (CellPhoneDB/Connectome/NATMI/SingleCellSignalR) → rank_aggregate consensus ligand-receptor tetrads → circle_plot/ccc_dotplot, consuming the annotated object from the scRNA-seq QC recipe. Molecular and Cellular Biology focus-day recipe; cookbook's first cell-cell-communication recipe. Proposed: no documented LLM-driven LIANA-MCP assembly; grounded in Dimitrov et al., Nat. Commun. 13:3735 (2022), a 2026 consensus-LIANA application (Wei et al., PLOS ONE 2026), and the method-disagreement benchmark (Xie et al., Biomolecules 13:1211 (2023)).
- Call peaks and find enriched motifs from ChIP-seq or ATAC-seq (Problem class: Data analysis; Evidence: Proposed): rung-3 toolbelt chaining the MACS3 skill (callpeak, narrow/broad mode → narrowPeak BED) into the HOMER skill (annotatePeaks.pl nearest-gene context + findMotifsGenome.pl de-novo/known motif enrichment). Molecular and Cellular Biology focus-day recipe; the binding-site/motif companion to the deepTools signal-profiling recipe, which deliberately stops before peak calling. Proposed: no documented LLM-driven MACS3→HOMER assembly; grounded in the field-standard pipeline (Zhang et al., Genome Biol. 9:R137 (2008); Heinz et al., Mol. Cell 38:576 (2010)).
- Analyze an existing MD trajectory for stability, flexibility, and contacts (Problem class: Data analysis; Evidence: Proposed): rung-2 MDAnalysis skill recipe taking a finished GROMACS/AMBER/NAMD trajectory through a load-and-sanity-check → aligned RMSD/RMSF/Rg → interface contact map + H-bond occupancy → backbone PCA battery, with the MDTraj skill as the DSSP/Ramachandran fallback. Integrative Structural and Computational Biology focus-day recipe; the post-simulation-analysis companion to the GROMACS setup recipe. Proposed: no documented LLM-driven MDAnalysis-skill assembly; grounded in Michaud-Agrawal et al., J. Comput. Chem. 32:2319 (2011), McGibbon et al., Biophys. J. 109:1528 (2015), and class-level agentic-MD evidence (MDCrow, Mach. Learn. Sci. Technol. 2025).
- Scan a therapeutic antibody for glycosylation sites (Problem class: Experimental design; Evidence: Proposed): rung-2 Glycoengineering skill recipe taking heavy/light-chain sequences through N-X-S/T sequon detection (flagging Fc Asn-297 vs unintended variable-domain sites) → O-glycosylation hotspot prediction → a parent-vs-variant sequon diff, with optional minimal site-knockout edit suggestions. Immunology and Microbiology focus-day recipe; cookbook's first antibody-developability / glycosylation recipe. Proposed: no documented LLM-driven glycoengineering-skill assembly; grounded in 2026 Fc-glycan/ADCC literature (Shuang et al., mAbs 2026; Illés 2026) and the galactosylation-as-CQA reference (Klingler et al., Biotechnol. Bioeng. 2024).
- Compute a bacterial pan-genome from a set of genome assemblies (Problem class: Data analysis; Evidence: Proposed): rung-3 toolbelt chaining the Bakta skill (identical per-genome annotation → GFF3) into the Roary skill (CD-HIT/BLAST/MCL clustering → core/soft-core/shell/cloud partition, gene_presence_absence.csv, and a core_gene_alignment.aln that feeds the phylogenetics recipe). Immunology and Microbiology focus-day recipe; cookbook's first comparative-genomics / pan-genome recipe. Proposed: no documented LLM-driven Bakta→Roary assembly; grounded in the field-standard pipeline (Page et al., Bioinformatics 2015; Schwengers et al., Microb. Genom. 2021) and a 2025 27,884-genome application (Sholeh et al., Mol. Genet. Genomics 2025).
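The N-X-S/T sequon detection and parent-vs-variant diff steps in the antibody glycosylation recipe above can be sketched roughly as below. This is an illustration under stated assumptions, not the Glycoengineering skill's implementation: it uses the conventional sequon definition (X is any residue except proline), assumes parent and variant have the same length (substitutions only), and all function names and sequences are hypothetical.

```python
import re

# N-X-S/T sequon: Asn, any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping sequons (e.g. in "NNSS") be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence):
    """Return (1-based position, tripeptide) for each N-X-S/T sequon."""
    seq = sequence.upper()
    return [(m.start() + 1, seq[m.start():m.start() + 3]) for m in SEQUON.finditer(seq)]


def sequon_diff(parent, variant):
    """Return sequon positions gained and lost in the variant relative to the parent."""
    parent_sites = {pos for pos, _ in find_sequons(parent)}
    variant_sites = {pos for pos, _ in find_sequons(variant)}
    return sorted(variant_sites - parent_sites), sorted(parent_sites - variant_sites)


# Hypothetical toy sequences, not real antibody chains.
parent = "AQNSTLKNPSG"
variant = "AQASTLKNGSG"
print(find_sequons(parent))          # [(3, 'NST')]
print(sequon_diff(parent, variant))  # ([8], [3])
```

In the toy pair, the N3A substitution removes the parent sequon, while replacing the proline after Asn-8 creates a new one; that gained site is the kind of unintended change the diff is meant to surface.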
Verified (no changes)
- 35 recipes spot-checked; all last_verified dates within the 30-day window, no aging recipes due.
2026-06-11
Added
- Profile ChIP-seq or ATAC-seq signal around genomic features (Problem class: Data analysis; Evidence: Proposed): rung-2 deepTools skill recipe taking aligned ChIP-seq/ATAC-seq BAMs through bamCoverage BPM-normalized bigWig generation → multiBamSummary + plotCorrelation replicate QC → computeMatrix + plotHeatmap/plotProfile TSS/peak-centered visualization, with upstream BAM handling via the pysam skill. Molecular and Cellular Biology focus-day recipe; cookbook's first ChIP-seq/ATAC-seq coverage-profiling recipe. Proposed: no documented LLM-driven deepTools workflow; grounded in Ramírez et al., NAR 44:W160 (2016) plus class-level Biomni.
- Predict gene-knockout phenotypes with flux balance analysis (Problem class: Data analysis; Evidence: Proposed): rung-2 COBRApy skill recipe taking a genome-scale SBML model through baseline FBA sanity-check → genome-wide single_gene_deletion essentiality ranking → focused double_gene_deletion synthetic-lethality screen, with an explicit growth-ratio essentiality threshold. Molecular and Cellular Biology focus-day recipe; cookbook's first constraint-based metabolic-modelling recipe. Proposed: no documented LLM-driven COBRApy workflow; grounded in Ebrahim et al., BMC Syst. Biol. 7:74 (2013) and Orth et al., Nat. Biotechnol. 28:245 (2010), plus class-level Biomni.
Verified (no changes)
- 33 recipes spot-checked; all last_verified dates within the 30-day window, no aging recipes due.
2026-06-10
Added
- Score point mutations for functional impact with a protein language model (Problem class: Data analysis; Evidence: Proposed): rung-2 ESM skill recipe taking a wild-type protein sequence (optionally fetched by UniProt accession via the gget skill) and a list of substitutions through masked-marginal log-likelihood-ratio scoring → a ranked tolerated/deleterious CSV, with a wt-marginal one-pass variant for full single-mutation landscapes. Integrative Structural and Computational Biology focus-day recipe; cookbook's first zero-shot variant-effect / protein-fitness recipe and the database-free complement to the clinical-variant interpretation recipe. Proposed: no documented LLM-driven ESM-skill scoring assembly; grounded in the canonical zero-shot method Meier et al., NeurIPS 2021, the ProteinGym benchmark, and 2025 directed-evolution use Zhang et al., Nat. Commun. 2025.
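The masked-marginal log-likelihood-ratio scoring named in this entry can be illustrated with a toy sketch. It deliberately makes no protein language model calls: the per-position log-probabilities are assumed to come from the model with that position masked, and here they are hypothetical numbers supplied by hand. The function names and the `A2V`-style substitution format are likewise assumptions for illustration.

```python
import math


def masked_marginal_score(log_probs_at_masked_pos, wt_aa, mut_aa):
    """Log-likelihood ratio log p(mut) - log p(wt) at one masked position."""
    return log_probs_at_masked_pos[mut_aa] - log_probs_at_masked_pos[wt_aa]


def rank_substitutions(sequence, substitutions, masked_log_probs):
    """Score substitutions like 'A2V' and sort from most to least tolerated.

    masked_log_probs maps a 1-based position to {amino acid: log-probability},
    i.e. the model's output with that position masked (supplied by the caller).
    """
    rows = []
    for sub in substitutions:
        wt_aa, pos, mut_aa = sub[0], int(sub[1:-1]), sub[-1]
        if sequence[pos - 1] != wt_aa:
            raise ValueError(f"{sub}: wild-type residue does not match sequence")
        rows.append((sub, masked_marginal_score(masked_log_probs[pos], wt_aa, mut_aa)))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical probabilities standing in for a protein language model's output.
toy_log_probs = {
    2: {"A": math.log(0.60), "V": math.log(0.30), "P": math.log(0.01)},
}
ranked = rank_substitutions("MAK", ["A2P", "A2V"], toy_log_probs)
print(ranked)
```

Scores near zero mean the model finds the mutant residue about as plausible as the wild type; strongly negative scores are the candidates a ranked CSV would list as likely deleterious.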
Verified (no changes)
- 31 recipes spot-checked; all last_verified dates within the 30-day window, no aging recipes due.
2026-06-09
Added
- Build a phylogenetic tree from a set of sequences (Problem class: Data analysis; Evidence: Proposed): rung-2 Phylogenetics skill recipe taking a FASTA of homologous sequences (viral genomes, microbial marker genes, protein families) through MAFFT --auto alignment → gap-column trimming → IQ-TREE 2 ModelFinder + ultrafast-bootstrap maximum-likelihood inference → midpoint/outgroup rooting → an ETE3-annotated tree figure, handing the Newick off to the ETE Toolkit and the 16S diversity recipe (which consumes the rooted tree for UniFrac). Immunology and Microbiology focus-day recipe; cookbook's first phylogenetics / tree-building recipe. Proposed: no documented LLM-driven phylogenetics workflow; grounded in the field-standard tool references Katoh & Standley, MBE 30:772 (2013), Minh et al., MBE 37:1530 (2020), Kalyaanamoorthy et al., Nat. Methods 14:587 (2017), and Hoang et al., MBE 35:518 (2018), plus class-level Biomni.
Updated
- Estimate pharmacokinetic properties of a small molecule: promoted Proposed → Reported on the first field report (issue #12). A user ran the full three-layer assembly through to a finished PK card and captured it in a standalone pk_card.py, verified across caffeine, ibuprofen, quercetin, and terfenadine. Added a Field reports subsection under Evidence and refreshed last_verified to 2026-06-09.
Verified (no changes)
- 3 recipes spot-checked (oldest last_verified first), all current; last_verified bumped to 2026-06-09: Scan approved drugs for repurposing candidates against a disease, Profile a compound's polypharmacology from ChEMBL bioactivity data, Triage an AlphaFold model for structure-based drug design. All linked catalog pages resolve and are unflagged; source DOIs stable.
User requests
- #12 @goodb: resolved. This entry had been stuck open since 2026-05-27 because the responder emitted no machine-readable trailer, so the request content lived only in the GitHub issue body, which the sandboxed curator agent (no gh/shell) could not read, leaving it "un-actionable" on every retry. Fixed at the source: the recipes.yml/curate.yml workflows now pre-fetch open user-request issue bodies into .request-bodies/<NN>.md before the agent runs, the responder fallback now rebuilds a structured queue entry from the issue-form fields, and RECIPE_AGENT.md/AGENT.md point the agent at the pre-fetched files instead of a gh issue view it can't run.
2026-06-08
Added
- Identify an unknown compound from an MS/MS spectrum (Problem class: Data analysis; Evidence: Proposed): rung-2 matchms skill recipe taking experimental tandem-MS spectra plus a reference library (GNPS / MassBank / in-house .msp) through format import → peak cleaning and metadata harmonization → modified-cosine scoring with precursor-m/z gating → a ranked candidate-identity CSV, handing confirmed InChIKeys off to the PubChem MCP and the polypharmacology recipe. Chemistry focus-day recipe; cookbook's first metabolomics / spectral-library-matching recipe. Proposed: no documented LLM-driven matchms workflow; grounded in the canonical library paper Huber et al., JOSS 5(52):2411 (2020) plus methodological anchors Onoprishvili et al., Bioinformatics (2025) (SimMS) and Xing et al., Anal. Chem. (2025) (enhanced reverse spectral search).
Verified (no changes)
- Aging-recipe sweep: oldest last_verified is 2026-05-24 (15 days), within the 30-day window; no recipes due for re-verification this run.
User requests
- #12 (@goodb): still no gh permission to read the issue body from this run; left open for next-run retry.
2026-06-07
Added
- Enumerate analogs around a lead compound for SAR expansion (Problem class: Hypothesis generation; Evidence: Proposed): rung-2 Datamol skill recipe taking a lead SMILES through standardization → tautomer / stereoisomer enumeration → single-point fragment-substitution scan → ECFP4 Tanimoto + QED scoring → a deduplicated SAR-expansion CSV, with explicit handoff to the VS-hit-filtering developability gate and the polypharmacology bioactivity lookup. Drug Repurposing and Discovery focus-day recipe; cookbook's first dedicated analog-enumeration / lead-optimisation recipe and the natural upstream of the existing hit-filtering recipe; cookbook's second Hypothesis generation recipe. Proposed: no documented LLM-driven Datamol enumeration workflow; closest grounding is the K-Dense rdkit→datamol→medchem lead-optimisation workflow plus the underlying primitives Rogers & Hahn, JCIM 50:742 (2010) (ECFP/Tanimoto), Bickerton et al., Nat. Chem. 4:90 (2012) (QED), and Griffen et al., J. Med. Chem. 54:7739 (2011) (matched molecular pairs).
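The Tanimoto scoring and deduplication steps mentioned in this entry reduce to set arithmetic once fingerprints exist. The sketch below assumes fingerprints are already available as sets of on-bit indices (in the recipe they would come from ECFP4 generation, which is not reproduced here); the bit sets, identifiers, and function names are hypothetical, and deduplication by exact string match presumes the SMILES were standardized upstream.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def dedupe_and_score(lead_bits, analogs):
    """Drop repeated identifiers and rank the remaining analogs by similarity to the lead."""
    seen, rows = set(), []
    for smiles, bits in analogs:
        if smiles in seen:
            continue
        seen.add(smiles)
        rows.append((smiles, tanimoto(lead_bits, bits)))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical on-bit sets, standing in for real ECFP4 fingerprints.
lead = {1, 2, 3, 4}
analogs = [("CCO", {1, 2, 3}), ("CCN", {1, 2, 5, 6}), ("CCO", {1, 2, 3})]
table = dedupe_and_score(lead, analogs)
print(table)
```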
Updated
- Nav orders rebalanced to keep alphabetical title ordering after the new addition. "Enumerate analogs…" inserted at 10; everything from "Estimate pharmacokinetic properties" downward shifted +1 (Estimate → 11, Filter VS hits → 12, Infer GRN → 13, Integrate single-cell → 14, Interpret variant → 15, Match patient → 16, Organize DICOM → 17, Parse FCS → 18, Prioritize targets → 19, Profile polypharmacology → 20, Run bulk RNA-seq → 21, Run first-pass QC → 22, Run functional enrichment → 23, Scan repurposing → 24, Set up MD → 25, Sort spikes → 26, Triage preprints → 27, Triage AlphaFold → 28, Fit survival → 29, Scan adverse events → 30).
Verified (no changes)
- 29 existing recipes spot-checked; none past the 30-day last_verified window (oldest is 2026-05-24, profile-compound-polypharmacology), so no re-verification was due this run.
2026-06-06
Added
- Fit a survival model to censored clinical outcomes (Problem class: Data analysis; Evidence: Proposed): rung-2 scikit-survival skill recipe taking a tidy covariate table plus a (time, event) outcome through structured-Surv encoding → Kaplan-Meier + log-rank → Cox PH (with a proportional-hazards check) → Random Survival Forest → cross-validated Harrell's c-index → risk-group stratification. First Translational Medicine focus-day recipe of this run; cookbook's first dedicated time-to-event / prognosis recipe. Proposed: no documented end-to-end LLM-driven sksurv workflow; closest grounding is the library reference Pölsterl, JMLR 21(212):1–6 (2020) and recent RSF-vs-nomogram prognosis studies Zhang et al., Transl. Cancer Res. (2026) and Liu et al., Medicine (2026).
- Scan adverse-event reports for a drug-safety signal (Problem class: Knowledge synthesis; Evidence: Proposed): rung-2 OpenFDA MCP recipe taking a drug name through generic-name resolution → FAERS top-reaction ranking → structured label / warning pull → label-vs-FAERS cross-check → an honest "reports, not rates" framing. Second Translational Medicine focus-day recipe of this run; promoted from the "Deferred - next-run priority" list; cookbook's first pharmacovigilance recipe. Proposed: no documented attempt of this exact MCP assembly; openFDA/FAERS is the canonical public pharmacovigilance source and the server wraps it faithfully.
Verified (no changes)
- 27 existing recipes spot-checked; none past the 30-day last_verified window (oldest is 2026-05-24), so no re-verification was due this run.
2026-06-05
Added
- Organize a raw DICOM dataset into a BIDS layout (Problem class: Workflow automation; Evidence: Proposed): rung-2 BIDS Claude Skill recipe taking a directory of vendor DICOMs through series-level inventory → HeuDiConv heuristic (or dcm2bids config) drafting → single-subject --dry-run audit → cohort conversion via dcm2niix → top-level dataset_description.json / participants.tsv / sidecar authoring → bids-validator triage → PyBIDS post-conversion query, with explicit IntendedFor cross-link logic for fieldmaps. First Neuroscience focus-day recipe of this run; promoted from the "Deferred - next-run priority" list. Cookbook's first imaging-side data-organization recipe, counterpart to the existing Discover NWB recordings on DANDI electrophysiology discovery recipe. Proposed because no documented end-to-end LLM-driven DICOM→BIDS workflow exists in peer-reviewed or preprint literature from the last 24 months; closest component-level grounding is Gorgolewski et al., Sci. Data 3:160044 (2016) and Poldrack et al., Imaging Neuroscience 2:1–19 (2024) (BIDS spec evolution); Yarkoni et al., JOSS 4(40):1294 (2019) (PyBIDS); Zwiers, Moia, Oostenveld, Front. Neuroinform. 15:770608 (2022) (BIDScoin); and Wulms et al., Sci. Data 10:673 (2023) (BIDSconvertR).
Updated
- Nav orders rebalanced to keep alphabetical title ordering after the new addition and to fix a stale collision between Run first-pass QC and Run functional enrichment (both stamped 20). "Organize a raw DICOM dataset…" inserted at 16; everything from "Parse FCS…" downward shifted by +1, with Run first-pass QC at 21 and Run functional enrichment at 22: Parse FCS flow-cytometry files → 17, Prioritize targets → 18, Profile polypharmacology → 19, Run bulk RNA-seq DE → 20, Run first-pass QC → 21, Run functional enrichment → 22, Scan repurposing → 23, Set up protein MD → 24, Sort spikes → 25, Triage preprints → 26, Triage AlphaFold → 27.
Verified (no changes)
- No aging recipes due: every last_verified date is within the 30-day window. The verification floor sits at 2026-05-24 (scan-drug-repurposing-candidates); next aging boundary is 2026-06-23.
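The relationship between the verification floor and the next aging boundary in this entry is plain date arithmetic, sketched below. The 30-day window and the dates come from this log; the function names are hypothetical, and treating a recipe as due only once it is strictly more than 30 days old is an assumption, since the log does not say whether the boundary day itself counts.

```python
from datetime import date, timedelta

AGING_WINDOW_DAYS = 30  # the 30-day last_verified window used in this log


def aging_boundary(last_verified_dates):
    """Return (verification floor, date on which the oldest recipe reaches the window)."""
    floor = min(last_verified_dates)
    return floor, floor + timedelta(days=AGING_WINDOW_DAYS)


def due_for_reverification(last_verified, today):
    """True once last_verified is more than the window ago (strictness is an assumption)."""
    return (today - last_verified).days > AGING_WINDOW_DAYS


floor, boundary = aging_boundary([date(2026, 5, 24), date(2026, 6, 4)])
print(floor, boundary)  # 2026-05-24 2026-06-23
print(due_for_reverification(floor, date(2026, 6, 5)))  # False
```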
User requests
- #12 @goodb: still cannot access the issue body (no gh permission for the repo in this run); leaving open in recipes/curator-state.md for the next run with gh access.
2026-06-04
Added
- Run functional enrichment on a gene list (Problem class: Data analysis; Evidence: Reported): rung-2 gget skill recipe taking a list of gene symbols through gget enrichr against GO BP, KEGG, Reactome, MSigDB Hallmark, and DisGeNET → per-library CSV → grounded natural-language summary with explicit verification pass against the saved tables and a random-gene negative-control step. First Molecular and Cellular Biology focus-day recipe of this run; the cookbook's first dedicated functional-enrichment / pathway-interpretation recipe and the natural downstream step after bulk RNA-seq DE. Reported evidence anchored in Wang et al., GeneAgent, Nature Methods 22:1677, 2025: self-verification against Enrichr and curated databases lifts ROUGE-L on MSigDB from 0.239±0.038 (GPT-4) to 0.310±0.047 (GeneAgent) across 1,106 gene sets, with 84% of 15,848 claims database-supported and 92% of self-verification decisions correct on a 132-claim expert-judged sample; complementary anchors Hu et al., Nat. Methods 21:2353, 2024 and Joshi et al., llm2geneset (bioRxiv 2024-11-12).
Verified (no changes)
- 5 recipes spot-checked, last_verified bumped to 2026-06-04; every linked catalog page resolves, every source URL still loads: Sort spikes from a Neuropixels recording end-to-end, Integrate multiple single-cell RNA-seq datasets across batches, Interpret a clinical variant from a natural-language query, Match a patient summary to recruiting clinical trials, Filter a virtual screening hit list with drug-likeness rules and structural alerts. Fixed one stale .md link → .html in the filter-virtual-screening recipe (RDKit-MCP cross-reference).
User requests
- #12 @goodb: still cannot access the issue body (no gh permission in this run); leaving open in recipes/curator-state.md for the next run with gh access.
2026-06-03
Added
- Dock a ligand library into a target structure with DiffDock (Problem class: Data analysis; Evidence: Proposed): rung-2 DiffDock skill recipe taking a PDB or AlphaFold target + ligand SMILES CSV through batch-CSV prep → diffusion sampling (20–40 samples/complex) → confidence-thresholded filtering (> 0 trustworthy, −1.5 to 0 inspect, < −1.5 drop) → top-K SDF export, with explicit handoffs to MedChem / DeepChem / molecular-dynamics downstream. First Integrative Structural and Computational Biology focus-day recipe of this run; cookbook's first dedicated docking recipe and natural downstream of the existing AlphaFold triage recipe. Proposed because no documented end-to-end LLM-orchestrated DiffDock virtual screen exists; closest component-level evidence is Corso et al., DiffDock-L (ICLR 2024, arXiv:2402.18396) (38%→80% RMSD<2Å on top one-third by confidence), Buttenschoen et al., PoseBusters (Chem. Sci. 15:3130, 2024), and Karelina et al., AF2-target docking (JCIM 63:6219, 2023) (~21% RMSD<2Å on AF2 models, motivating the upstream-triage gate in step 2).
Updated
- Nav orders rebalanced to keep alphabetical title ordering after the new addition. "Dock a ligand library…" inserted at 8; everything from "Draft Phase 2/3…" downward shifted by +1: Draft Phase 2/3 clinical-trial protocol → 9, Estimate PK → 10, Filter virtual screening → 11, Infer GRN → 12, Integrate single-cell → 13, Interpret clinical variant → 14, Match patient to trials → 15, Parse FCS flow-cytometry files → 16, Prioritize targets → 17, Profile polypharmacology → 18, Run bulk RNA-seq DE → 19, Run first-pass QC → 20, Scan repurposing → 21, Set up protein MD → 22, Sort spikes → 23, Triage preprints → 24, Triage AlphaFold → 25.
Verified (no changes)
- No aging recipes due: every last_verified date is within the 30-day window. The recipe set's verification floor sits at 2026-05-22 (integrate-single-cell-datasets, sort-spikes-from-neuropixels-recording); next aging boundary is 2026-06-21.
User requests
- #12 @goodb: still cannot access the issue body (no gh permission for the repo in this run); leaving the request open in recipes/curator-state.md for the next run with gh access.
2026-06-02
Added
- Compute 16S microbiome alpha/beta diversity from a BIOM table (Problem class: Data analysis; Evidence: Proposed): rung-2 scikit-bio skill recipe taking a BIOM feature table + sample metadata + Newick tree through rarefaction → Shannon/Simpson/Faith's PD → weighted/unweighted UniFrac → PCoA → PERMANOVA with explicit grouping-column and permutation-count flags. First Immunology and Microbiology focus-day recipe of this run; cookbook's first dedicated microbiome / community-ecology recipe. Proposed because no documented end-to-end attempt of this exact assembly exists; closest class-level evidence is Huang et al. Biomni (bioRxiv 2025.05.30.656746) whose published benchmark includes microbiome disease-taxa bioinformatics across five datasets (HMP, MetaPhlAn2 human metagenomics, drinking-water OTU matrices) at ~4× over base-LLM accuracy.
- Parse FCS flow-cytometry files for downstream immunophenotyping (Problem class: Data analysis; Evidence: Proposed): rung-2 FlowIO skill recipe taking a directory of vendor-emitted FCS 2.0/3.0/3.1 files through FlowData parsing → per-file metadata harvest → scatter/fluorescence/time channel categorisation → optional log/gain transforms → concatenated long-format events Parquet, with explicit failure surfacing for partial-acquisition files. Second Immunology and Microbiology focus-day recipe; cookbook's first cytometry / FCS recipe. Proposed because no documented end-to-end attempt of this exact assembly exists; closest class-level evidence is "Enhancing Clinical Workflow Efficiency in Flow Cytometry Reporting with LLMs" (PMC13053331, J. Clin. Immunol. 2026), which demonstrates pathologist-level accuracy of fine-tuned LLMs on the downstream report-generation step the parsed-events output feeds into.
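Two of the alpha-diversity indices named in the 16S recipe above, Shannon and Simpson, can be written out directly from their definitions. The sketch below is for intuition only and is not the scikit-bio code path the recipe drives: it uses the natural logarithm for Shannon (the log base is a choice and libraries differ), computes Simpson as 1 - sum(p_i^2), and the count vectors are hypothetical.

```python
import math


def _proportions(counts):
    """Convert raw feature counts to proportions, ignoring zero-count features."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no observations")
    return [c / total for c in counts if c > 0]


def shannon(counts):
    """Shannon index with natural log: -sum(p_i * ln p_i)."""
    return -sum(p * math.log(p) for p in _proportions(counts))


def simpson(counts):
    """Simpson diversity: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in _proportions(counts))


# Hypothetical per-sample feature counts (e.g. one column of a rarefied table).
even_sample = [10, 10, 10, 10]
skewed_sample = [37, 1, 1, 1]
for name, sample in [("even", even_sample), ("skewed", skewed_sample)]:
    print(name, round(shannon(sample), 3), round(simpson(sample), 3))
```

The even sample scores higher on both indices than the skewed one, which is the behaviour the rarefaction step is meant to keep comparable across samples of different depth.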
Updated
- Nav orders rebalanced to keep alphabetical title ordering after the two additions: Assemble Census atlas → 1, Benchmark ADMET → 2, Build target dossier → 3, Compute 16S microbiome diversity → 4 (new), Compute HRV → 5, Convert instrument data → 6, Discover NWB on DANDI → 7, Draft Phase 2/3 clinical-trial protocol → 8, Estimate PK → 9, Filter virtual screening → 10, Infer GRN → 11, Integrate single-cell → 12, Interpret clinical variant → 13, Match patient to trials → 14, Parse FCS flow-cytometry files → 15 (new), Prioritize targets → 16, Profile polypharmacology → 17, Run bulk RNA-seq DE → 18, Run first-pass QC → 19, Scan repurposing → 20, Set up protein MD → 21, Sort spikes → 22, Triage preprints → 23, Triage AlphaFold → 24.
Verified (no changes)
- No aging recipes due: every last_verified date is within the 30-day window. The recipe set's verification floor sits at 2026-05-22 (integrate-single-cell-datasets, sort-spikes-from-neuropixels-recording); next aging boundary is 2026-06-21.
User requests
- #12 (claude:recipe-feedback): remains in ## User requests (open); gh CLI is still not available in this run's environment so the issue body cannot be inspected. Retry next run with gh access.
2026-06-01
Added
- Convert raw analytical instrument data to Allotrope ASM JSON (Problem class: Workflow automation; Evidence: Reported): rung-2 instrument-data-to-allotrope skill recipe taking a vendor-format file (cell counter, plate reader, HPLC, MS, qPCR) through auto-detect → allotropy native parse → ASM JSON-LD + flattened CSV + exportable Python parser, with strict validation of the raw-vs-derived split before LIMS / data-lake handoff. First Chemistry focus-day recipe of this run; cookbook's first workflow-automation recipe spanning the Anthropic life-sciences plugin family. Anchored in the Claude for Life Sciences launch (October 2025), the Anthropic Vi-CELL tutorial, and the underlying Benchling-Open-Source/allotropy reference parser.
- Set up a protein molecular dynamics simulation in GROMACS from a PDB ID (Problem class: Experimental design; Evidence: Proposed): rung-2 molecule-mcp recipe driving the GROMACS Copilot server end-to-end (topology → solvation → ion neutralisation → minimisation → NVT/NPT → 50 ns production → RMSD/RMSF/Rg) with explicit force-field / water-model / GPU-offload flags. Second Chemistry focus-day recipe; first cookbook entry exercising the GROMACS path of the molecule-mcp bundle. Proposed because no documented end-to-end attempt of this exact assembly exists; closest peer-reviewed class-level evidence is MDCrow (Campbell et al., Mach. Learn. Sci. Technol. 2025, DOI:10.1088/2632-2153/ae4b07), which uses OpenMM rather than GROMACS but shares the same architecture, plus GROMACS-supporting follow-ons DynaMate (arXiv:2512.10034) and NAMD-Agent (arXiv:2507.07887), and the MDGym benchmark (arXiv:2605.08941) as a reality check (Claude Code / Codex / OpenHands all solve <21% of easy GROMACS/LAMMPS tasks).
Updated
- Nav orders rebalanced to restore strict alphabetical title ordering after the two additions and to correct two prior off-by-many drifts (Benchmark ADMET was at 20 instead of 2; Prioritize Targets was at 19 instead of 14): Assemble Census atlas → 1, Benchmark ADMET → 2, Build target dossier → 3, Compute HRV → 4, Convert instrument data → 5 (new), Discover NWB on DANDI → 6, Draft a Phase 2/3 clinical-trial protocol → 7, Estimate PK → 8, Filter virtual screening → 9, Infer GRN → 10, Integrate single-cell → 11, Interpret clinical variant → 12, Match patient to trials → 13, Prioritize targets → 14, Profile polypharmacology → 15, Run bulk RNA-seq DE → 16, QC single-cell → 17, Scan repurposing → 18, Set up protein MD in GROMACS → 19 (new), Sort spikes → 20, Triage preprints → 21, Triage AlphaFold → 22.
- recipes/curator-state.md: ## Missing components entry for "DeepChem (K-Dense Skill)" removed; DeepChem is now catalogued at catalog/tools/deepchem.md.
Verified (no changes)
- No aging recipes due: every last_verified date is within the 30-day window. The recipe set's verification floor sits at 2026-05-22 (integrate-single-cell-datasets, sort-spikes-from-neuropixels-recording); next aging boundary is 2026-06-21.
User requests
- #12 (claude:recipe-feedback): remains in ## User requests (open); gh CLI is still not available in this run's environment so the issue body cannot be inspected. Retry next run with gh access.
2026-05-31
Added
- Prioritize targets within a disease via Open Targets (Problem class: Knowledge synthesis; Evidence: Reported): rung-2 Open Targets plugin recipe taking a disease (EFO/MONDO) to a ranked target shortlist across the four prioritisation pillars (precedence, tractability, doability, safety) with cited GraphQL fields per cell. First DR&D focus-day recipe of this run; complements the existing gene-in Build a target dossier and disease-in/drug-out Scan approved drugs for repurposing candidates recipes. Anchored in Buniello et al. NAR 53(D1):D1467–D1475 (2025) and Minikel et al. Nature 629:624–629 (2024); closest LLM-driven application: Zunzunegui Sanz et al. bioRxiv 2025-06-13 and More et al. npj Precision Oncology 10:95 (2025).
- Benchmark an ADMET property with PyTDC (Problem class: Data analysis; Evidence: Reported): rung-2 PyTDC skill recipe driving the official TDC ADMET_Group benchmark (frozen scaffold splits, canonical metric per task, 5-seed leaderboard row format) so a new model gets a directly comparable number. Second DR&D focus-day recipe; first cookbook entry that produces leaderboard-comparable ADMET metrics. Anchored in Huang et al. NeurIPS Datasets and Benchmarks (2021), the published TDC-2 framework Velez-Arce et al. NeurIPS 2024, and recent LLM-driven workflows (Hao et al. Scientific Data 11:864 (2024); Yuan et al. arXiv:2406.06316 (2024)).
Verified (no changes)
- No aging recipes due: every last_verified date is within the 30-day window. The recipe set's verification floor sits at 2026-05-22 (integrate-single-cell-datasets, sort-spikes-from-neuropixels-recording); next aging boundary is 2026-06-21.
User requests
- #12 (claude:recipe-feedback): remains in ## User requests (open); gh CLI is still not available in this run's environment so the issue body cannot be inspected. Retry next run with gh access.
2026-05-30
Added
- Draft a Phase 2/3 clinical-trial protocol from an indication brief (Problem class: Manuscript prep; Evidence: Reported): rung-2 clinical-trial-protocol Anthropic Healthcare plugin recipe that walks an indication / endpoint paragraph through the four-waypoint flow (regulatory classification, ClinicalTrials.gov competitive landscape, sample-size calculation, FDA/NIH-template drafting), emerging with a reviewable draft Phase 2/3 protocol scaffold. First Translational Medicine focus-day recipe of the new run; resolves a previously deferred candidate. Evidence anchored in the Anthropic plugin tutorial (Claude for Healthcare launch, January 2026) and class-level validation in Markey et al. Clinical Trials 2025 (80% content relevance, >99% terminology accuracy with RAG), Shin et al. Clinical Pharmacology & Therapeutics 2026 (100% accuracy on disease/intervention/comparator extraction, 14/15 trials for sample-size identification), Hauptman et al. JMIR Dermatology 2026, and Maleki, arXiv 2404.05044 (2024).
Updated
- Nav orders rebalanced across the recipe set to keep alphabetical ordering after the addition: Assemble Census atlas → 1, Build target dossier → 2, Compute HRV → 3, Discover NWB on DANDI → 4, Draft a Phase 2/3 clinical-trial protocol → 5 (new), Estimate PK → 6, Filter virtual screening → 7, Infer GRN → 8, Integrate single-cell → 9, Interpret clinical variant → 10, Match patient to trials → 11, Profile polypharmacology → 12, Run bulk RNA-seq DE → 13, QC single-cell → 14, Scan repurposing → 15, Sort spikes → 16, Triage preprints → 17, Triage AlphaFold → 18.
Verified (no changes)
- No aging recipes due: every `last_verified` date is within the 30-day window. The recipe set's verification floor sits at 2026-05-22 (`integrate-single-cell-datasets`, `sort-spikes-from-neuropixels-recording`); next aging boundary is 2026-06-21.
User requests
- #12 (`claude:recipe-feedback`): remains in `## User requests (open)`; `gh` CLI is still not available in this run's environment so the issue body cannot be inspected. Retry next run with `gh` access.
2026-05-29 (second pass, Neuroscience directed)
Added
- Discover NWB recordings on DANDI and prepare them for sorting (Problem class: Knowledge synthesis; Evidence: Reported): rung-3 Neurosift Tools MCP + neuropixels-analysis skill toolbelt taking a semantic query about extracellular recordings to a filtered list of DANDI assets. Claude calls `dandi_semantic_search`, `dandi_search_by_neurodata_type`, `dandiset_assets`, and `nwb_file_info` over the public DANDI API, applies user-supplied hypothesis constraints (probe model, session duration, presence of a `Units` table), and emits `dandi download` / `pynwb` streaming snippets ready for the Sort spikes from a Neuropixels recording recipe. Third Neuroscience-primary recipe; resolves a previously deferred candidate. Evidence anchored in Magland, Ly, Rübel, Dichter. Scientific Data 12:1988 (2025), doi:10.1038/s41597-025-06285-x, which documents an LLM-driven agentic chat assistant and notebook-generation pipeline for DANDI exploration from the same Flatiron lab that ships the Neurosift Tools MCP; reviewed by neurophysiology specialists with most generated notebooks rated "very helpful." Canonical Neurosift citation: Magland, Soules, Baker, Dichter. JOSS 9(97):6590 (2024), doi:10.21105/joss.06590.
Updated
- Nav orders rebalanced across the recipe set to keep alphabetical ordering after the addition: Assemble Census atlas → 1, Build target dossier → 2, Compute HRV → 3, Discover NWB on DANDI → 4, Estimate PK → 5, Filter virtual screening → 6, Infer GRN → 7, Integrate single-cell → 8, Interpret clinical variant → 9, Match patient to trials → 10, Profile polypharmacology → 11, Run bulk RNA-seq DE → 12, QC single-cell → 13, Scan repurposing → 14, Sort spikes → 15, Triage preprints → 16, Triage AlphaFold → 17.
Verified (no changes)
- No aging recipes this run: every `last_verified` date is within the 30-day window. The recipe set's verification floor sits at 2026-05-22 (`integrate-single-cell-datasets`, `sort-spikes-from-neuropixels-recording`); next aging boundary is 2026-06-21.
User requests
- #12 (`claude:recipe-feedback`): remains in `## User requests (open)`; `gh` CLI still unavailable in this run's environment so the issue body cannot be inspected. Retry next run with `gh` access.
2026-05-29
Added
- Compute HRV from an ECG recording (Problem class: Data analysis; Evidence: Proposed): rung-2 NeuroKit2 Claude skill recipe taking a single-lead ECG to validated R-peaks plus time-domain, frequency-domain, and non-linear HRV indices, with `nk.signal_quality`-driven epoch exclusion. Second Neuroscience-primary recipe in the cookbook (joins the Neuropixels spike-sorting recipe). Component evidence: Makowski et al. Behavior Research Methods 2021 (NeuroKit2 reference) and Pham et al. Sensors 2021 (HRV indices tutorial). Closest LLM-orchestrated analogue: EEGAgent (Yan et al., arXiv:2511.09947, 2025-11-12; AAAI-26), which uses a different signal modality and a custom toolbox, not NeuroKit2.
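To make the "time-domain HRV indices" in the entry above concrete, here is a minimal standard-library sketch that computes mean heart rate, SDNN, and RMSSD from a list of RR intervals. It is not the NeuroKit2 API and not the recipe's implementation; the `time_domain_hrv` helper and the example intervals are hypothetical, and the intervals are assumed to be already cleaned of ectopic beats and artifacts.

```python
from math import sqrt
from statistics import mean, stdev


def time_domain_hrv(rr_ms: list[float]) -> dict[str, float]:
    """Return basic time-domain HRV indices from RR intervals given in milliseconds."""
    if len(rr_ms) < 3:
        raise ValueError("need at least three RR intervals")
    # Successive differences between neighbouring RR intervals.
    diffs = [later - earlier for earlier, later in zip(rr_ms, rr_ms[1:])]
    return {
        "mean_hr_bpm": 60000.0 / mean(rr_ms),            # beats per minute
        "sdnn_ms": stdev(rr_ms),                         # sample standard deviation of RR
        "rmssd_ms": sqrt(mean([d * d for d in diffs])),  # root mean square of differences
    }


# Illustrative RR series only, not a real recording.
indices = time_domain_hrv([800, 810, 790, 805, 795])
print(indices)
```

SDNN reflects overall variability across the series, while RMSSD is driven by beat-to-beat changes, which is why the two can diverge on the same recording.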
Updated
- Nav orders rebalanced across the recipe set to keep alphabetical ordering after the addition: Assemble Census atlas → 1, Build target dossier → 2, Compute HRV → 3, Estimate PK → 4, Filter virtual screening → 5, Infer GRN → 6, Integrate single-cell → 7, Interpret clinical variant → 8, Match patient to trials → 9, Profile polypharmacology → 10, Run bulk RNA-seq DE → 11, QC single-cell → 12, Scan repurposing → 13, Sort spikes → 14, Triage preprints → 15, Triage AlphaFold → 16.
Verified (no changes)
- 4 recipes spot-checked at the 30-day boundary and bumped to `last_verified: 2026-05-29`: Triage preprints, QC single-cell, Build target dossier, Run bulk RNA-seq DE. All linked catalog tools (bio-research, pubmed, single-cell-rna-qc, pydeseq2, open-targets, uniprot, alphafold, depmap) remain present and unflagged.
User requests
- #12 (`claude:recipe-feedback`): remains in `## User requests (open)`; `gh` CLI is not available in this run's environment so the issue body still cannot be inspected. Retry on the next run that has `gh` access.
2026-05-28
Added
- Assemble a tissue reference atlas from the CELLxGENE Census (Problem class: Data analysis; Evidence: Reported): rung-2 cellxgene-census skill recipe pulling a versioned AnnData slice from the CZ CELLxGENE Discover Census with the CZ-trained scVI embedding attached for reference mapping. First Molecular and Cellular Biology focus-day recipe to consume the Census. Evidence anchored in the Census team's `comp_bio_data_integration_scvi` notebook, the scvi-hub paper (Ergen et al., Nature Methods 2025), and the integrated human lung atlas (Sikkema et al., Nature Medicine 2023).
- Infer a gene-regulatory network from single-cell RNA-seq (Problem class: Data analysis; Evidence: Reported): rung-2 Arboreto skill recipe running GRNBoost2 on a QC'd / integrated AnnData with a TF-restricted regressor and seed-stabilised reruns; produces the ranked TF→target edge table that pySCENIC consumes downstream. Evidence anchored in Moerman et al. Bioinformatics 2019 (GRNBoost2), Van de Sande et al. Nature Protocols 2020 (SCENIC workflow), and Bravo González-Blas et al. Nature Methods 2023 (SCENIC+).
Updated
- Nav orders rebalanced across the recipe set to keep alphabetical ordering after the two additions: Assemble Census atlas → 1, Build target dossier → 2, Estimate PK → 3, Filter virtual screening → 4, Infer GRN → 5, Integrate single-cell → 6, Interpret clinical variant → 7, Match patient to trials → 8, Profile polypharmacology → 9, Run bulk RNA-seq DE → 10, QC single-cell → 11, Scan repurposing → 12, Sort spikes → 13, Triage preprints → 14, Triage AlphaFold → 15.
Missing components flagged to the catalog curator
- pySCENIC wrapper (cisTarget + AUCell): would unlock the full SCENIC pipeline downstream of the new GRN-inference recipe (motif filtering against cisTarget databases, per-cell regulon AUCell scoring).
Verified (no changes)
- All 13 pre-existing recipes have `last_verified` within the 30-day window (oldest 2026-05-21); no aging verifications were due this run.
2026-05-27
Added
- Estimate pharmacokinetic properties of a small molecule (Problem class: Knowledge synthesis; Evidence: Proposed): rung-3 RDKit + MedChem + ChEMBL assembly producing a descriptor / rule-based / analog-anchored PK card for a single SMILES. Ships in response to user request #8. Closest documented analogues: ChemCrow (Bran et al., Nature Machine Intelligence 2024) and PharmaBench (Niu et al., Scientific Data 2024).
- Triage an AlphaFold model for structure-based drug design (Problem class: Knowledge synthesis; Evidence: Proposed): rung-2 AlphaFold MCP recipe producing a pLDDT-anchored go/refine/fall-back-to-PDB verdict on a UniProt accession. First Integrative Structural and Computational Biology-primary recipe. Evidence grounded in the EBI AlphaFold DB papers (Varadi 2022, Varadi 2024), the interface-pLDDT benchmark (Bryant 2022), and the AlphaFold-for-docking assessment (Karelina 2023).
Updated
- Nav orders rebalanced across the recipe set to keep alphabetical ordering after the two additions: Estimate PK properties → 2, Filter virtual screening hits → 3, Integrate single-cell datasets → 4, Interpret clinical variant → 5, Match patient to trials → 6, Profile polypharmacology → 7, Run bulk RNA-seq DE → 8, QC single-cell RNA-seq → 9, Scan repurposing candidates → 10, Sort spikes → 11, Triage preprints → 12, Triage AlphaFold model → 13.
Missing components flagged to the catalog curator
- ADMET-AI / AdmetLab 3.0 / Deep-PK wrapper: would let the new PK-properties recipe move from descriptor-and-analog estimation to defensible ML prediction for CYP / hERG / microsomal endpoints.
- DeepChem (K-Dense Skill): already flagged in the catalog curator's state; would also strengthen the PK-properties recipe.
- Co-folding / AlphaFold-Multimer / Boltz-2 wrapper: would unlock a complex-modelling companion to the AlphaFold triage recipe.
Verified (no changes)
- All recipes have `last_verified` within the 30-day window; no aging verifications were due this run.
2026-05-25
Added
- Filter a virtual screening hit list with drug-likeness rules and structural alerts (Problem class: Data analysis; Evidence: Reported): rung-2 MedChem + Datamol cascade for Lipinski → Veber → PAINS → BRENK triage of SMILES hit lists. First Chemistry-primary recipe in the cookbook. Evidence anchored in the K-Dense lead-optimisation workflow and the foundational filter papers (Baell & Holloway PAINS 2010, Brenk 2008, Lipinski 2001, Veber 2002).
- Profile a compound's polypharmacology from ChEMBL bioactivity data (Problem class: Knowledge synthesis; Evidence: Reported): rung-2 single-tool recipe over the ChEMBL connector. Second Chemistry-primary recipe and the compound-centric mirror of the existing target-dossier recipe. Evidence grounded in the Anthropic ChEMBL Connector tutorial and the ChEMBL curation paper (Mendez et al., NAR 2019).
Updated
- Integrate multiple single-cell RNA-seq datasets across batches: nav_order 2 → 3 for alphabetical position after the new Filter recipe.
- Interpret a clinical variant from a natural-language query: nav_order 3 → 4.
- Match a patient summary to recruiting clinical trials: nav_order 4 → 5.
- Run bulk RNA-seq differential expression from a counts matrix: nav_order 5 → 7 (after the new Profile recipe).
- Run first-pass QC on a single-cell RNA-seq dataset: nav_order 6 → 8.
- Scan approved drugs for repurposing candidates against a disease: nav_order 7 → 9.
- Sort spikes from a Neuropixels recording end-to-end: nav_order 8 → 10.
- Triage a stack of new preprints in your field: nav_order 9 → 11.
Verified (no changes)
- 9 existing recipes spot-checked; all `last_verified` dates within the 30-day window, all linked catalog pages resolve.
2026-05-24
Added
- Scan approved drugs for repurposing candidates against a disease (Problem class: Knowledge synthesis; Evidence: Proposed): rung-3 toolbelt composing the Open Targets plugin, ChEMBL connector, and DrugBank MCP; first focused Drug Repurposing and Discovery recipe in the cookbook. Evidence anchors: DeepDrug Alzheimer's repurposing graph (Li et al., Scientific Reports 2025), Robin / ripasudil dAMD discovery (Ghareeb et al., Nature 2026), and DREBIOP LLM-validation benchmark (Zunzunegui Sanz et al., bioRxiv 2025-06-13).
Updated
- Sort spikes from a Neuropixels recording end-to-end: nav_order 7 → 8 for alphabetical position.
- Triage a stack of new preprints in your field: nav_order 8 → 9 for alphabetical position.
Verified (no changes)
- 8 existing recipes spot-checked; all `last_verified` dates within the 30-day window, all linked catalog pages resolve.
2026-05-23
Added
- Match a patient summary to recruiting clinical trials (Problem class: Knowledge synthesis; Evidence: Reported): rung-2 BioMCP / cyanheads-ClinicalTrials.gov-MCP recipe; first Translational-Medicine-focused recipe in the cookbook. Evidence grounded in TrialGPT (Jin et al., Nature Communications 2024, 87.3% criterion-matching accuracy).
- Interpret a clinical variant from a natural-language query (Problem class: Knowledge synthesis; Evidence: Proposed): rung-2 BioMCP recipe; pairs with the trial-matching recipe for variant-driven enrollment. Closest analogous benchmark is MARRVEL-MCP (bioRxiv 2025-11).
Updated
- Run bulk RNA-seq differential expression from a counts matrix: nav_order 3 → 5 for alphabetical position after the two new TM recipes.
- Run first-pass QC on a single-cell RNA-seq dataset: nav_order 4 → 6 for alphabetical position.
- Sort spikes from a Neuropixels recording end-to-end: nav_order 5 → 7 for alphabetical position.
- Triage a stack of new preprints in your field: nav_order 6 → 8 for alphabetical position.
Verified (no changes)
- 5 existing recipes spot-checked; all `last_verified` dates within the 30-day window, all linked catalog pages resolve.
2026-05-22
Added
- Integrate multiple single-cell RNA-seq datasets across batches (Problem class: Data analysis; Evidence: Reported): rung-2 recipe wrapping the Anthropic `scvi-tools` skill for scVI / scANVI batch integration; written in response to user request #7; evidence grounded in Hrovatin 2025 and scIB-E 2025 (source).
- Sort spikes from a Neuropixels recording end-to-end (Problem class: Data analysis; Evidence: Reported): rung-2 recipe wrapping the K-Dense `neuropixels-analysis` skill (SpikeInterface + Kilosort4); first Neuroscience-only recipe in the cookbook (source).
Updated
- Run bulk RNA-seq differential expression from a counts matrix: nav_order shifted 2 → 3 for alphabetical position.
- Run first-pass QC on a single-cell RNA-seq dataset: nav_order shifted 3 → 4 for alphabetical position.
- Triage a stack of new preprints in your field: nav_order shifted 4 → 6 for alphabetical position.
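The nav-order shifts above follow from re-sorting recipe titles alphabetically after each addition. A minimal sketch of that renumbering, using the six full titles that exist as of this entry, is shown below; the `assign_nav_order` helper and the case-insensitive sort key are assumptions about how the ordering is derived, not the curator's actual tooling.

```python
def assign_nav_order(titles: list[str]) -> dict[str, int]:
    """Map each recipe title to its 1-based position in case-insensitive alphabetical order."""
    ordered = sorted(titles, key=str.casefold)
    return {title: position for position, title in enumerate(ordered, start=1)}


# Titles as they appear in this log; the last two are the additions in this entry.
titles = [
    "Build a target dossier from gene name to structure to cancer dependency",
    "Run bulk RNA-seq differential expression from a counts matrix",
    "Run first-pass QC on a single-cell RNA-seq dataset",
    "Triage a stack of new preprints in your field",
    "Integrate multiple single-cell RNA-seq datasets across batches",
    "Sort spikes from a Neuropixels recording end-to-end",
]
nav_order = assign_nav_order(titles)
for title, position in sorted(nav_order.items(), key=lambda item: item[1]):
    print(position, title)
```

Sorting on the full title rather than the short label explains why "Run bulk RNA-seq..." precedes "Run first-pass QC..." in the log's numbering.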
Verified (no changes)
- 4 existing recipes spot-checked (all linked catalog pages resolve; `last_verified` 2026-05-21 still within the 30-day window so no bumps).
2026-05-21
Added
- Run first-pass QC on a single-cell RNA-seq dataset (Problem class: Data analysis; Evidence: Reported): rung-2 recipe wrapping Anthropic's `single-cell-rna-qc` skill for canonical scverse MAD-based filtering of 10x `.h5` / AnnData `.h5ad` inputs (source).
- Run bulk RNA-seq differential expression from a counts matrix (Problem class: Data analysis; Evidence: Reported): rung-2 recipe wrapping the K-Dense PyDESeq2 skill for negative-binomial GLM differential expression, including pseudobulk single-cell handoff guidance (source).
- Build a target dossier from gene name to structure to cancer dependency (Problem class: Knowledge synthesis; Evidence: Proposed): first rung-3 toolbelt recipe composing Open Targets, UniProt, AlphaFold, and DepMap into a one-page target dossier; first `Proposed`-evidence entry in the cookbook (closest analogue).
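The MAD-based filtering named in the single-cell QC entry above flags cells whose QC metric lies too many median absolute deviations from the median. The sketch below shows that rule with the standard library only; it is not the `single-cell-rna-qc` skill's code, and the `mad_outliers` helper, the threshold of 5 MADs, and the example values are assumptions for illustration.

```python
from statistics import median


def mad_outliers(values: list[float], n_mads: float = 5.0) -> list[bool]:
    """Flag values lying more than n_mads median absolute deviations from the median."""
    center = median(values)
    mad = median([abs(v - center) for v in values])
    # If mad is 0 (over half the values are identical), any deviation is flagged.
    return [abs(v - center) > n_mads * mad for v in values]


# Illustrative per-cell QC metric, not real data.
metric = [10, 11, 9, 10, 12, 10, 50]
flags = mad_outliers(metric)
kept = [v for v, is_outlier in zip(metric, flags) if not is_outlier]
print(kept)  # [10, 11, 9, 10, 12, 10]
```

Because both the center and the spread are medians, a single extreme cell barely moves the threshold, which is the usual argument for MAD over mean-and-standard-deviation cutoffs.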
Updated
- Triage a stack of new preprints in your field: nav_order shifted from 1 to 4 to reflect alphabetical ordering after the three new Mol/Cell Bio additions; no content changes.
Verified (no changes)
- 1 recipe spot-checked, current (`triage-new-preprints, last_verified 2026-05-21`).
2026-05-21 (initial seed)
Added
- Section bootstrap: `recipes/` section created with landing page, landscape page, and the all-recipes index; `recipes/curator-state.md` initialized; `RECIPES_CHANGELOG.md` (this file) created. Curator prompt and daily workflow added at `RECIPE_AGENT.md` and `.github/workflows/recipes.yml`.
- Triage a stack of new preprints in your field (Problem class: Literature triage; Evidence: Reported): first seed recipe demonstrating the schema and the lowest rung of the simplicity ladder (Claude Code alone + bioRxiv MCP) (source).