Ladder AI is a modular framework for self-improving Large Language Models (LLMs) via recursive problem decomposition and test-time reinforcement learning. This project is a reimplementation and extension of the ideas from the paper *LADDER: Self-Improving LLMs Through Recursive Problem Decomposition*.

Why Ladder?

  1. No Curated Datasets or Human Feedback Needed
    Ladder empowers LLMs to improve autonomously by generating and solving their own progressively simpler variants of complex problems, eliminating the need for labeled datasets or human intervention.
  2. Structured Self-Learning and Curriculum Creation
    The recursive decomposition process forms a natural difficulty gradient, allowing the model to build a curriculum tailored to its current capabilities and learn incrementally (see the first sketch after this list).
  3. Test-Time Reinforcement Learning (TTRL)
    Ladder extends self-improvement to inference time, where the model dynamically generates and solves new variants of each test problem, achieving state-of-the-art results such as 90% accuracy on the MIT Integration Bee, outperforming even OpenAI's o1 model (see the second sketch after this list).
  4. Generalizable and Cost-Effective
    The approach is domain-agnostic and can be applied wherever formal verification is possible, including math, programming, and theorem proving. It is highly scalable and cost-effective, as it does not require external data or expensive human annotation.
  5. Moves Beyond Naive Scaling
    Ladder demonstrates that strategic self-improvement and recursive learning can unlock new capabilities in LLMs, challenging the notion that bigger models alone drive progress.
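
The curriculum idea in point 2 can be made concrete with a short sketch. Everything here is illustrative: `generate_variants` and `build_curriculum` are hypothetical names, not this framework's API, and a real implementation would prompt the LLM for simpler variants rather than tagging strings.

```python
def generate_variants(problem: str, depth: int) -> list[tuple[str, int]]:
    """Recursively produce progressively simpler variants of `problem`.

    Returns (variant, difficulty_tag) pairs. In LADDER the variants would
    come from LLM prompting; string tags keep this sketch self-contained.
    """
    if depth == 0:
        return []
    variants = []
    for i in range(2):  # branch factor of 2, purely for illustration
        simpler = f"{problem} [simplified v{depth}.{i}]"
        variants.append((simpler, depth))
        variants.extend(generate_variants(simpler, depth - 1))
    return variants


def build_curriculum(problem: str, max_depth: int = 3) -> list[str]:
    """Order variants easiest-first so training climbs the difficulty gradient."""
    tree = generate_variants(problem, max_depth)
    # A lower tag means the variant came from deeper recursion, i.e. is simpler.
    tree.sort(key=lambda pair: pair[1])
    return [variant for variant, _ in tree]


if __name__ == "__main__":
    for step in build_curriculum("integrate x * sin(x) dx")[:4]:
        print(step)
```

Sorting easiest-first is what turns the raw variant tree into a curriculum the model can climb incrementally.

The test-time loop from point 3 can be sketched in the same spirit. Here `gen_variants`, `solve`, `verify`, and `update` stand in for the variant generator, the model, the formal verifier, and the RL step (the paper uses GRPO); the arithmetic lambdas at the bottom exist only so the sketch runs end to end and are not real components.

```python
from typing import Callable

def ttrl(
    problem: str,
    gen_variants: Callable[[str], list[str]],
    solve: Callable[[str], str],
    verify: Callable[[str, str], bool],
    update: Callable[[str, str, float], None],
    rounds: int = 3,
) -> str:
    """Adapt on self-generated variants of a single test problem, then answer it."""
    for _ in range(rounds):
        for variant in gen_variants(problem):
            attempt = solve(variant)
            reward = 1.0 if verify(variant, attempt) else 0.0  # formal check
            update(variant, attempt, reward)  # reinforce verified solutions
    return solve(problem)

# Trivial stand-ins so the sketch executes:
answer = ttrl(
    "2 + 2",
    gen_variants=lambda p: [p],
    solve=lambda p: str(eval(p)),           # toy solver for arithmetic strings
    verify=lambda p, a: str(eval(p)) == a,  # stands in for a CAS / proof checker
    update=lambda v, a, r: None,            # no-op in place of an RL update
)
print(answer)  # -> 4
```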

Setting Up

Get started with your own experiments and development:
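
If the repository follows a standard Python project layout, a local editable install would look like the sketch below; the URL and directory name are placeholders, not the project's real ones.

```bash
# Hypothetical setup sketch; substitute the real repository URL.
git clone <repo-url>
cd ladder-ai                                  # placeholder directory name
python -m venv .venv && source .venv/bin/activate
pip install -e .                              # assumes a pip-installable package
```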

Next Steps

Explore the documentation to learn about dataset generation, engines, finetuning, deployment, and more.
Ready to dive in? Start with the Quickstart guide!