What is Learning-to-Defer?

Learning-to-Defer is a decision framework where a model is allowed to say: “I will answer,” or “I will pass this to an expert.” The key is that deferring is not treated as failure—it is a deliberate action that can improve overall system performance when mistakes are costly or when some inputs require specialized expertise.

In practice, L2D turns prediction into a routing problem: for each query, route it either to a cheap/fast model or to an expert (a human, a stronger model, or a tool), trading off accuracy, cost (money/latency/human time), and risk.

Learning-to-Defer model diagram

My focus

My work studies L2D from a theoretical and practical angle: how to design routing rules with formal guarantees (e.g., consistency relative to the best defer/act policy in a hypothesis class), and how to make these systems reliable under limited feedback, distribution shift, and multiple experts.

Concrete examples

  • Healthcare triage: a model suggests a diagnosis for routine cases, but defers uncertain or rare patterns to a clinician.
  • Customer support: a chatbot answers common questions; it routes billing disputes or policy exceptions to a human agent.
  • GenAI content assistance: an LLM drafts routine emails or summaries; if the prompt is ambiguous, high-stakes (legal/medical), or it detects low confidence, it defers to a human editor for verification and final wording.
  • Finance (stock prediction/trading): a model issues a buy/sell/hold signal under normal market conditions; during earnings releases, macro shocks, or when uncertainty/risk is high, it defers to a risk manager (or a stricter rule-based guardrail) before any trade is executed.

Why this is useful

  • Safer automation: reduce catastrophic errors by escalating the right cases.
  • Efficient use of experts: spend human/compute budget only where it helps most.
  • Measurable trade-offs: explicitly control accuracy vs cost vs latency.
  • Guarantees (when assumptions hold): provide conditions under which the learned routing approaches an optimal defer/act policy.