Lead · RESEARCH · 6 min

Cross-Component Interference Undermines LLM Agent Stacking

Researchers studying LLM agent scaffolding have found that stacking more components does not reliably improve performance, a phenomenon they term cross-component interference. In controlled experiments across 32 component subsets on HotpotQA and GSM8K using Llama-3.1-8B and 70B, a single-tool agent outperformed a fully equipped system by 32 percent on HotpotQA, while a three-component subset beat the all-inclusive configuration by 79 percent on GSM8K.

Research

Papers, benchmarks, and what we'd build differently after reading them.

Research · May 18 · 5 min

LATTE Framework Uses Task Graphs to Coordinate LLM Teams

Research · May 18 · 6 min

NeuroAgent Automates Multimodal Neuroimaging Pipelines

Research · May 14 · 7 min

CIVeX Verifier Targets Causal Gaps in Tool-Using Agents

Research · May 14 · 6 min

RAO Trains Agents to Delegate Sub-Tasks to Themselves

Tutorials

Code-first dispatches. Most run in under fifteen minutes.

Multimodal Retrieval Pipeline with LlamaIndex AgentMesh Trust Layer
Tutorial · May 18 · 11 min

Per-Generation LLM Cost Tracking with Langfuse v4 SDK
Tutorial · May 18 · 10 min

Wiring OpenTelemetry Spans into a LangGraph Agent for Cost Attribution
Tutorial · May 18 · 8 min

Wiring OpenTelemetry Spans into a LightRAG Retrieval Pipeline
Tutorial · May 18 · 7 min

Tools this week
Plumb 0.4
May 13
Native eval traces, OTLP exporter
lattice/agents v1.2
May 12
Replay scrubber, branch comparison
skiff-trace 0.9
May 10
Lightweight OTel for tool-using agents
harness-kit 2026.5
May 09
Dockerised runners with cgroup limits