CVEvolve

Autonomous agentic harness from Argonne’s Advanced Photon Source that discovers and refines scientific data-processing algorithms for unstructured experimental images via zero-code, lineage-aware search with optional visual inspection of intermediate outputs.

Affiliation: Advanced Photon Source, Argonne National Laboratory (paper)
First introduced: 2026-05 (arXiv:2605.11359, dated 2026-05-12)
Lifecycle stages: Analysis (autonomous discovery of executable image-processing and feature-detection algorithms for downstream scientific interpretation)
Autonomy level: Semi-autonomous. The user supplies a task description, a data directory, and optional metric hints; CVEvolve then runs an open-ended search up to a fixed round budget, with optional holdout testing
Domain focus: Unstructured scientific imaging at synchrotron beamlines: image registration, peak detection, segmentation
Availability: Unknown; no public repository is disclosed in the preprint

Approach

A controller wraps an LLM agent (Claude Opus 4.6 in all reported cases) that uses code-execution, evaluation, history, image-rendering, and web-search tools. Work is organized into three stages.

  • Preparation stage. The agent inspects task data, examines representative images, fixes the primary optimization metric from the task description or user hints, and builds a minimal evaluation harness for later rounds. It can construct and manage its own local runtime (e.g., via uv) including dependency installation.
  • Baseline stage. User-provided or agent-suggested baseline algorithms are evaluated; results are stored in a persistent SQL search-state database so later rounds avoid redundant reevaluation.
  • Algorithm development stage. Each round picks one of generate (broad exploration), tune (exploitative refinement of a strong parent), or evolve (crossover of two parents). Branching is history-driven, with periodic forced generate rounds to preserve exploration. Parents are sampled with lineage-aware stochastic sampling inspired by MAP-Elites: a Gibbs distribution p_i ∝ exp(−(r_i − 1)/τ) over ranks r_i with temperature τ, multiplied during evolve crossover by a same-lineage penalty λ^{m_i} that reduces the weight of candidates from already-selected lineages. The agent starts each round with a fresh context (only the system and task prompts) to control context size.
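
The parent-sampling rule in the algorithm development stage can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that rank 1 is the best candidate, that m_i counts the parents already selected from candidate i's lineage, and that parents are drawn without replacement. The function names, the tuple layout, and the default values of tau and lam are hypothetical.

```python
import math
import random


def selection_weights(ranks, lineages, chosen_lineages, tau=1.0, lam=0.5):
    """Unnormalised weights exp(-(r_i - 1) / tau) * lam ** m_i per candidate."""
    weights = []
    for rank, lineage in zip(ranks, lineages):
        # Assumption: m_i = number of already-selected parents in this lineage.
        m = chosen_lineages.count(lineage)
        weights.append(math.exp(-(rank - 1) / tau) * lam ** m)
    return weights


def sample_parents(candidates, k=2, tau=1.0, lam=0.5, rng=None):
    """Draw up to k distinct parents from (name, rank, lineage) tuples."""
    rng = rng or random.Random()
    pool = list(candidates)
    chosen = []
    for _ in range(min(k, len(pool))):
        weights = selection_weights(
            [c[1] for c in pool],
            [c[2] for c in pool],
            [c[2] for c in chosen],
            tau,
            lam,
        )
        # random.choices normalises the weights internally.
        pick = rng.choices(range(len(pool)), weights=weights, k=1)[0]
        chosen.append(pool.pop(pick))
    return chosen


if __name__ == "__main__":
    population = [("a1", 1, "A"), ("a2", 2, "A"), ("b1", 3, "B")]
    print(sample_parents(population, k=2, tau=1.0, lam=0.5))
```

Under these assumptions, a small τ concentrates selection on the top-ranked candidates, while a small λ pushes the second evolve parent toward a different lineage; with λ = 0 the second parent can never share the first parent's lineage.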

CVEvolve exposes file-system tools, environment management/execution tools, an image viewer that handles floating-point/TIFF data with percentile dynamic-range selection and log scaling, search-state tools backed by a SQL database, and web-search tools (arXiv, Semantic Scholar, Tavily). An image-follow-up middleware injects rendered images back into the conversation when a tool returns an image path. An optional holdout test runs in a separate temporary workspace handled by a dedicated agent so the development agent never sees holdout data.
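
The percentile dynamic-range selection and log scaling attributed to the image viewer can be illustrated with a small standard-library sketch. The details here are assumptions rather than the tool's actual behavior: the function names, the default 1st/99th percentile window, the shift-by-minimum before the logarithm, and the choice to render non-finite pixels as black are all hypothetical.

```python
import math


def percentile(sorted_values, q):
    """Linearly interpolated q-th percentile (0..100) of an ascending list."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac


def render_gray(image, p_low=1.0, p_high=99.0, log_scale=False):
    """Map a 2-D list of floats to 8-bit gray levels using a percentile window."""
    finite = [v for row in image for v in row if math.isfinite(v)]
    if not finite:
        raise ValueError("image contains no finite pixels")
    floor = min(finite)

    def transform(v):
        # Optional log compression; shifting by the minimum keeps log1p defined.
        return math.log1p(v - floor) if log_scale else v

    values = sorted(transform(v) for v in finite)
    lo, hi = percentile(values, p_low), percentile(values, p_high)
    span = hi - lo
    out = []
    for row in image:
        out_row = []
        for v in row:
            if not math.isfinite(v) or span <= 0:
                out_row.append(0)  # NaN/inf pixels and flat images render black
                continue
            t = (transform(v) - lo) / span
            out_row.append(round(255 * min(1.0, max(0.0, t))))
        out.append(out_row)
    return out


if __name__ == "__main__":
    detector = [[0.0, 1.0, 10.0], [100.0, 1000.0, 1e6]]
    print(render_gray(detector, 0, 100))
    print(render_gray(detector, 0, 100, log_scale=True))
```

With a linear mapping, a single hot pixel leaves the rest of a high-dynamic-range detector frame almost black; the log-scaled version spreads the weaker intensities across the gray range, which is presumably why such an option matters for diffraction data.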

Validation

Three case studies on real synchrotron data: (1) x-ray fluorescence microscopy translational image registration, (2) Bragg peak detection in x-ray diffraction images, (3) polycrystalline diffraction image segmentation. Each study ran for 20–40 rounds with development and holdout sets. Results are compared against task-appropriate baselines (brute-force error minimization, phase correlation) and, on the registration task, against OpenEvolve (an open-source AlphaEvolve implementation).
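
For readers unfamiliar with the brute-force baseline, the sketch below shows one plausible reading of "brute-force error minimization" for translational registration: an exhaustive search over integer shifts that minimizes the mean squared error on the overlapping region. The paper's baseline may differ (for example in search range, error measure, or subpixel handling); the function name, the shift convention, and max_shift are hypothetical.

```python
def brute_force_shift(reference, moving, max_shift=3):
    """Return the integer (dy, dx) minimising mean squared error on the overlap.

    Convention (an assumption): reference[y][x] is compared with
    moving[y + dy][x + dx].
    """
    h, w = len(reference), len(reference[0])
    best = None
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            total, count = 0.0, 0
            # Restrict to pixels for which the shifted index stays in bounds.
            for y in range(max(0, -dy), min(h, h - dy)):
                for x in range(max(0, -dx), min(w, w - dx)):
                    diff = reference[y][x] - moving[y + dy][x + dx]
                    total += diff * diff
                    count += 1
            if count == 0:
                continue
            mse = total / count
            if best is None or mse < best[0]:
                best = (mse, dy, dx)
    return best[1], best[2]


if __name__ == "__main__":
    size = 8
    ref = [[0.0] * size for _ in range(size)]
    ref[2][2], ref[3][4], ref[5][3] = 5.0, 9.0, 7.0
    # Build a copy of ref translated by (dy, dx) = (1, 2).
    mov = [[0.0] * size for _ in range(size)]
    for y in range(size - 1):
        for x in range(size - 2):
            mov[y + 1][x + 2] = ref[y][x]
    print(brute_force_shift(ref, mov))
```

The cost grows with the square of the search range times the image area, which is one reason such baselines are usually limited to small shifts.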

Notable results

  • XRF image registration. Best candidate average Euclidean error 0.12 on the holdout set vs. 0.98 (brute force), 5.59 (phase correlation), and 0.23 (OpenEvolve at 500 iterations). This is a roughly eightfold reduction relative to brute force, the best classical baseline, and about half the error of OpenEvolve.
  • Bragg peak detection. Holdout F1 rose from 0.298 (baseline) to 0.788 for the round-5 candidate, with precision improving from 0.237 to 0.839 and recall from 0.400 to 0.743. The holdout-test agent surfaced over-optimization that began after round 9 on the development image, demonstrating the value of holdout monitoring on small development sets.
  • Diffraction image segmentation. Discovered a workflow combining radial-background subtraction, multi-pass peak detection (LoG / connected components / proximity-based recovery / SNR maxima), and prominence/shape validation. Across 40 rounds, weighted IoU improves substantially over the baseline thresholding approach.
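
The reported figures can be cross-checked with the standard metric definitions. The sketch below assumes F1 is the harmonic mean of precision and recall and that the registration error is the mean Euclidean distance between predicted and true shifts; the function names and the example shift values are illustrative and do not come from the paper.

```python
import math


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def mean_euclidean_error(predicted, truth):
    """Average Euclidean distance between paired (dy, dx) shift estimates."""
    distances = [
        math.hypot(py - ty, px - tx)
        for (py, px), (ty, tx) in zip(predicted, truth)
    ]
    return sum(distances) / len(distances)


if __name__ == "__main__":
    # Precision/recall pairs quoted above for Bragg peak detection.
    print(round(f1_score(0.237, 0.400), 3))  # 0.298
    print(round(f1_score(0.839, 0.743), 3))  # 0.788
    # Ratios of the quoted registration errors.
    print(round(0.98 / 0.12, 1))  # 8.2
    print(round(0.23 / 0.12, 1))  # 1.9
    # Made-up shifts, only to show the error metric in use.
    print(mean_euclidean_error([(3.0, 4.0), (0.0, 0.0)], [(0.0, 0.0), (0.0, 0.0)]))  # 2.5
```

The precision and recall values reproduce both quoted F1 scores, and the error ratios match the roughly eightfold gap to brute force and the roughly twofold gap to OpenEvolve.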

Primary paper

Du et al., “CVEvolve: Autonomous Algorithm Discovery for Unstructured Scientific Data Processing,” arXiv:2605.11359.

Other references

None yet.

Code

Not released at preprint time.