534 Episodes

  1. Active Learning for Adaptive In-Context Prompt Design

    Published: 03.04.2025
  2. Visual Chain-of-Thought Reasoning for Vision-Language-Action Models

    Published: 03.04.2025
  3. On the Biology of a Large Language Model

    Published: 01.04.2025
  4. Async-TB: Asynchronous Trajectory Balance for Scalable LLM RL

    Published: 01.04.2025
  5. Instacart's Economics Team: A Hybrid Role in Tech

    Published: 31.03.2025
  6. Data Mixture Optimization: A Multi-fidelity Multi-scale Bayesian Framework

    Published: 31.03.2025
  7. Why MCP won

    Published: 31.03.2025
  8. SWEET-RL: Training LLM Agents for Collaborative Reasoning

    Published: 31.03.2025
  9. TheoryCoder: Bilevel Planning with Synthesized World Models

    Published: 30.03.2025
  10. Driving Forces in AI: Scaling to 2025 and Beyond (Jason Wei, OpenAI)

    Published: 29.03.2025
  11. Expert Demonstrations for Sequential Decision Making under Heterogeneity

    Published: 28.03.2025
  12. TextGrad: Backpropagating Language Model Feedback for Generative AI Optimization

    Published: 27.03.2025
  13. MemReasoner: Generalizing Language Models on Reasoning-in-a-Haystack Tasks

    Published: 27.03.2025
  14. RAFT: In-Domain Retrieval-Augmented Fine-Tuning for Language Models

    Published: 27.03.2025
  15. Inductive Biases for Exchangeable Sequence Modeling

    Published: 26.03.2025
  16. InverseRLignment: LLM Alignment via Inverse Reinforcement Learning

    Published: 26.03.2025
  17. Prompt-OIRL: Offline Inverse RL for Query-Dependent Prompting

    Published: 26.03.2025
  18. Alignment from Demonstrations for Large Language Models

    Published: 25.03.2025
  19. Q♯: Distributional RL for Optimal LLM Post-Training

    Published: 18.03.2025
  20. Scaling Test-Time Compute Without Verification or RL is Suboptimal

    Published: 14.03.2025

Cut through the noise. We curate and break down the most important AI papers so you don’t have to.
